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Abstract 

We describe RAGE, the "Radiation Adaptive Grid Eulerian" radiation-hydrodynamics 
code, including its data structures, its paralleUzation strategy and performance, its hydro- 
dynamic algorithm(s), its (gray) radiation diffusion algorithm, and some of the considerable 
amount of verification and validation efforts. The hydrodynamics is a basic Godunov solver, 
to which we have made significant improvements to increase the advcction algorithm's ro- 
bustness and to converge stiffnesses in the equation of state. Similarly, the radiation trans- 
port is a basic gray diffusion, but our treatment of the radiation-material coupling, wherein 
we converge nonlinearities in a novel manner to allow larger timesteps and more robust 
behavior, can be applied to any multi-group transport algorithm. 



*Science Applications International Corp., MS A-1, 10260 Campus Point Drive, San Diego, CA, 92121 
tLos Alamos National Laboratory, MS T087, PO Box 1663, Los Alamos, NM 87545 
*TaylorMade-adidas Golf, 5545 Fermi Court, Carlsbad, CA 92008-7324 



Introduction 



RAGE, an acronym for Radiation Adaptive Grid Eulerian, is a multidimensional (ID, 2D, 3D), 
multi-material, massively parallel, Eulerian hydrodynamics code for use in solving the Euler 
equations coupled with a radiation diffusion equation (gray or multigroup) in a variety of high- 
deformation flow problems. It uses computational cells of unit aspect ratio (squares in Cartesian 
ID, 2D, 3D; tori in cylindrical ID and 2D, and shells in spherical ID). The mesh is refined (or 
unrefined) locally every computational cycle by an Adaptive Mesh Refinement (AMR) algorithm 
that looks at local spatial variation of physical quantities (pressure, density, material boundaries, 
etc.) and then bisects or recombines cells so as to maintain a desired degree of resolution. 
RAGE is intended for general applications without tuning of algorithms or parameters, and is 
portable enough to run on a wide variety of platforms, from desktop PC's (Windows, Linux, and 
Macintosh) to the latest MMP and SMP supercomputers. 

After a brief historical review, the next section will discuss methods of generating meshes 
from various geometry descriptions and will introduce certain data-structure concepts, while 
section 2 will discuss our parallel-centric data structures in greater depth, because they allow 
us to scale to kilo-pe's without interfering with the physics. Section 3 will discuss the adaption 
logic briefly, because it uses a very conservative {i.e. mesh-profligate) algorithm, whose only 
merit is that all material boundaries and sharp gradients will only be found at the finest level 
of the mesh. 

Because our Eulerian hydrodynamic algorithm has more of a Lagrangian flavor than most 
Eulerian codes, section 4 will describe the relation between the Riemann solution and the ad- 
vection in detail. Our equation of state lookup techniques will be briefly discussed in section 5, 
and the logic that is required to modify the Riemann solver to handle self-gravity is covered in 
section 6. 

rage's novel numerical treatment of the energy coupling between matter and radiation is 
covered in section 7. The exponential differencing of the heating term allows a smooth transition 
to equilibrium diffusion. The code uses an iterative solution to the nonlinear terms in this part 
of the equations, allowing larger timesteps and less computer time. 

We conclude with a brief discussion of some of the work done at Los Alamos National Lab- 
oratory to verify the correctness of the algorithms and validate the appropriateness of those 
algorithms for a broad range of problems of interest to that community. 

A turbulence model and a multifluid hydrodynamic treatment are under development at 
LANL, and work has also begun on improved AMR methods. We intend to discuss these and 
other work (e.g. multigroup radiation, ionization physics) in future articles. 

Historical Background 

The SAGE (SAIC Adaptive Grid Eulerian) code is a multidimensional (ID, 2D, 3D), multima- 
terial, massively parallel, Eulerian hydrodynamics code for use in solving the Euler equation 
(with optional thermal conduction) in a variety of high-deformation flow problems, begun by M. 
Gittings in 1990 to support DNA/DTRA work [1,2]. 

The RAGE code is built on top of SAGE by adding various radiation packages (gray diffusion 
and multigroup diffusion packages were written by T. Betlach and N. Byrne as an Internal 
Research and Development project in 1990-1991 [3]) and is used for problems in which radiative 
transport of energy is important. 

RAGE is the result of a long prior evolution from earlier incarnations as a modeling code 
for ground-effects experiments at the Nevada Test Site and for underwater blast-effects. It was 
brought to Los Alamos National Laboratory in 1995 in order to develop an AMR Eulerian code 
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suitable for problems of interest to that community. The original code was written for the 
supercomputers of the early 1990's: Cray vector processors. We found that these machines were 
not fast enough to complete 3D simulations (even rudimentary ones) in any reasonable amount 
of calendar time; based on 2D runs that were done at this time, we estimated that 3D runs 
would take more than 60 years to complete on the CRAY architecture. 

With the advent of the Department of Energy's ASCI (Advanced Simulation and Computing 
Initiative) program^ in 1996, LANL and SAIC began a joint effort to convert the original CRAY 
vector Fortran- 77 code to a new paradigm: parallel and modular code via effective software 
quality engineering. We chose to employ "vanilla" Fortran-90/95 with C-language interfaces 
for I/O and the explicit use of message passing to effect communication among the processors. 
Domain decomposition was used to spread the computational cells of the problem among the 
processors. 

The massively-parallel version of the codes was designed to run extremely large problems in 
a reasonable amount of calendar time. Our target is scalable performance to 10,000 processors 
on a 1 billion cell problem that requires hundreds of variables per cell, multiple physics packages 
{e.g. radiation and hydrodynamics), and implicit matrix solves for each cycle. Currently, the 
largest simulations we do are three-dimensional, using over 600 million computation cells and 
running for literally months of calendar time using 2000 processors. 

SAGE itself provided an early testbed for development of massive parallelism as part of the 
ASCI program and has been shown to scale well to thousands of processors [4]. 

1 SETUP and the RAGE Computational Grid 

The RAGE computational grid is set up by specifying a coarse uniform mesh in the appropriate 
number of dimensions, D, and then refining that mesh according to various criteria. Each 
processor is required to have a multiple of 2^ zones at setup, which allows us to treat our base 
grid as if it too were composed of adapted cells (i.e. no special run-time coding is required for 
this coarsest level of cells) . For efficiency, that multiple should be on the order of thousands, or 
circa 50k zones/pe. 

The RAGE computational grid is an octree decomposition of the model space (i.e. adaption 
bisects each zone in each dimension) . The tree depth is selected to resolve field gradients within 
a material to one user-specified resolution and material boundaries to a second user-specified 
resolution. Because RAGE allows a mixture of materials in any cell, it is not necessary to 
subdivide the cells to the size of the smallest geometric features - if a zone is not mixed on cycle 
zero, it soon will be. We refer to the depth of the subdivision required to attain the requested 
resolution as the "level" of the grid. The 2:1 edge length ratio of mother to daughter cells is 
enforced as the maximum for any adjacent cells, whether or not they have a common ancestor. 

The only geometric query that RAGE grid generation requires^ is a "point containment" test, 
which determines if a specified region contains a query point. The user assigns a priority value 
for each geometric region to be gridded (normally, just the order in which regions are defined). 
When the grid generation algorithm queries to determine which geometric region contains a 
query point, each region is tested in order from highest to lowest (but one) priority. The first 
region for which the point containment test returns inside is returned as the containing region. 
This process assures that all assembly gaps and voids within the model are assigned a material, 
since regional covers the entire mesh. 

^now known as the ASC (Advanced Scientific Computing) program. 

^One other query is required to return the material properties - density, temperature, etc. - once the grid is 
built. 
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Figure 1: RAGE adaptive mesh refinement is limited to a 2:1 ratio - adjacent 
cells may not differ by more than one level. 

The problem domain is subdivided into as many subdomains as the number of processors 
assigned to the problem. The grid for each subdomain is then generated independently, except 
for two adjustments that require occasional interprocessor communication: 1) load balancing, 
and 2) ensuring the edge length ratio never exceeds 2:1. 

With the mesh structure fixed, we approximate the material properties within the finest cells 
("leaves") of the octree by using multiple queries in the spirit of a Monte Carlo integral over the 
distributions of density and temperature (or pressure and temperature, or density and internal 
energy - the three alternate ways to specify the initial state of a region). 

1.1 Grid Generation via Geometric Queries 

Since RAGE only requires a point containment test, a wide variety of geometric modeling tech- 
niques are feasible. Geometric support is currently divided among SAGE primitives, an inte- 
grated geometry query library, Spica, and a Lagrangian link file query library, Merak. 

1.1.1 SAGE Primitives 

The intrinsic geometry specification support includes spherical, ellipsoidal, conical frustum, tri- 
angular prism and quadrilateral prism regions, as well as perturbed boundary regions (sine and 
cosine perturbations), and radially swept 2D regions (used, e.g., for overlaying a 1-d spherical 
dump into a 2-d cylindrical mesh). Use of the intrinsic geometric modeling involves describ- 
ing a physical problem within the input file and then verifying through RAGE execution that 
the desired model has been achieved (various graphics options allow the user to examine the 
"cycle zero" mesh and properties before starting the first physics cycle). These geometric prim- 
itives combined with judicious application of the RAGE region-priority ordering scheme permit 
a surprising number of complex simulations. 
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1.1.2 The Spica Library 

The Spica library allows RAGE to generate grids from OSO geometric regions, OSO being a 
Los Alamos supported geometric modeling program [5]. Geometry created in (or imported into) 
OSO and used for initial grid generation, includes CSG combinations of triangular facetted 
closed b-rep surfaces (which in turn are imported into OSO as stereo lithography or "STL" 
files), solid representations, and implicit surfaces. OSO supports importing data from several 
geometric modeling and commercial CAD packages. 

Spica currently accepts STL files as input, simply because many packages write that format; 
many other equivalent formats can easily be translated to STL. The solid geometry description 
nodes contain a variety of primitive objects: box, cone, cylinder, sphere, parallelepiped, torus, 
and a surface of revolution generated by a piecewise linear curve. These primitives are composed 
into regions using OSO. While it can be challenging to create extremely detailed models using 
these primitives, there is a considerable query speed advantage to doing so. OSO also supports 
STL files as first class primitives that can be translated, rotated, scaled, and combined with the 
previously specified primitives to compose regions. In practice, combinations of all the geometry 
description node types are used in computational models. 

Separating the region selection from the geometry description is useful because it moves 
the exact definition of inside or outside to the tree, rather than leaving it up to the geometry 
generation package, with the advantage that all these CAD/CAM descriptions can be used in 
RAGE at the minimal cost of the query routine. 

Figure 2 shows an internal combustion engine model that was created in Pro/ENGINEER, 
output as separate STL files for each region of the assembly, and collected for viewing, scaling, 
positioning, and rendering by OSO. The OSO dataset was then processed via the Spica routines 
while reading the RAGE input file. 
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Figure 2: OSO rendering of an internal combustion 
engine model created in Pro/E. 



Figure 3: Close-up of a cross section of 
a RAGE grid generated from the Pro/E 
model. 



Figure 3 presents a close-up of a cross section of a RAGE grid generated from the Pro/ENGINEER- 
OSO model. The boundaries of the cylinder block, head, valve springs, and retainers are modeled 
at level-5. Note the gradual transition from level-1 to level-5 caused by the enforcement of 2:1 
edge length ratios between adjacent cells. This grid has been restricted to five levels for purposes 
of illustration; typically, a well resolved computational grid will descend to level-8. 

One and two-dimensional meshes can be set up from a cross-section of a 3-d model, and 3-d 
models can be created by rotating cylindrically symmetric 2-d models. 
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2 Data Structures and Parallel Implementation 



In a parallel environment, RAGE is a MIMD (Multiple Instruction Multiple Data) code with 
synchronization at points in the instruction streams where inter-processor communication occurs. 
Such communication is handled at the lowest level by a machine-specific implementation of the 
MPI (Message Passing Interface) library, and at the application code level by a set of RAGE 
routines (the "token library" ) that sits atop the machine-specific library and hides most of the 
details of inter-processor communication from the application code. Although the code is mainly 
designed with a distributed-memory multiprocessor architecture in mind, it also runs on shared- 
memory machines and indeed on uniprocessor machines, including Mac's and PC's. 

2.1 Computational Cells and Cell Faces 

Cells are labeled by a single index, I, regardless of dimension. The highest level of refinement 
may change as the simulation proceeds, and in addition, cells may be coarsened, i.e., a previously 
refined block of two, four or eight cells may be recombined into a single larger cell. Spatially ad- 
jacent cells are not allowed to differ by more than one level of refinement. If changing conditions, 
e.g., the passage of a shock front, require local refinement by more than one level, the refinement 
process is made to propagate outward to satisfy this requirement. Also, the refinement algorithm 
"looks ahead" so that it can, for example, anticipate the arrival of a moving shock front; this 
tends to put the most intense action into a locally uniform subgrid at the locally highest level 
of refinement. 

At problem setup, blocks of two, four or eight level- 1 cells are assigned index values in a 
linear sequence along a compact (Hilbert) space-filling curve [6], [7] that covers the computational 
domain, or (by default) in Fortran column-major order (we were surprised to find that on some 
architectures, this naive partitioning scaled better than the fancier space-filling curve). Cells 
are then allocated to processors by means of a simple partitioning of this curve. This produces 
compact spatial processor-subdomains and at the same time a linear global ordering of the cells. 
It also approximately minimizes the processor surface-to- volume ratio (i.e., the number of cells 
that share a face with a cell that resides on another processor divided by the number of cells 
that do not — roughly equivalent to the communication-to-computation ratio). Subsequent cell 
refinement, either at problem setup or later during the simulation, proceeds via the creation of 
new blocks of daughter cells. The newly created cells are inserted into the existing index sequence 
following the last cell in the block containing the appropriate mother cell; this preserves a degree 
of spatial locality in the cell ordering. It also allows us to enforce the rule that a cell and its 
siblings (cells that share the same mother cell) are always assigned sequential cell indices and 
that they always reside on the same processor - during the cycle the daughters are created; there 
is no guarantee they stay together on subsequent cycles after further adaption. The price that is 
paid for this locality-preserving indexing is that the cell index-set must be rebuilt at the end of 
every computational cycle, as part of the cell refinement process. Aside from the specific cases 
that have just been enumerated, there is no necessary connection between a cell's index and its 
spatial location in the mesh, so that a relatively large amount of indirect addressing is required. 

Active or "leaf" cells are cells at the finest local level of refinement, i.e., cells with no daugh- 
ters. The physics algorithms usually involve only the active cells, although the self-gravity 
package's Poisson solver does use inactive cell information to construct its multipole elements 
(see section 6). In addition to the cell arrays, the code constructs face arrays to be used in the 
implementation of the finite-volume physics algorithms; for example, diffusion coefficients are 
evaluated on cell faces. Faces are called active if they connect two active cells. If a cell face 
lies on a processor boundary, the face is assigned only to one of the two processors, in order 
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to avoid double counting in code that loops over faces. Most cell arrays for physical quantities 
persist between computational cycles, except for indexing changes due to mesh refinement and 
load balancing operations, and are written out to restart dump files; face arrays, in contrast, are 
rebuilt anew at the beginning of each cycle. 

Load balancing is not done every cycle, but only when the distribution of cells among proces- 
sors has gotten out of balance by more than some preset threshold. The load balancing operation 
occurs via a simple repartition of the global cell index set, followed by the necessary movement 
of cell data among processors. 

2.2 Communication 

Coiniminication in RAGE involves movement of data between adjacent cells, between cells and 
cell- faces, and between cells at different levels (mothers to daughters and daughters to mothers). 
In a multiprocessor environment, this will involve movement of data between processors. RAGE 
provides a small set of routines that execute these communication operations in such a way 
that the details of the inter-processor communication are hidden, as much as possible, from the 
application code. 

Much of the inter-processor communication at the applic;ation c;oclc level employs the RAGE 
"clone" construct. This is a kind of ghost-cell mechanism, where clones are local (on-processor) 
copies of cells that reside on other processors and share a common face with a cell on the local 
processor. 

The following example shows how the communication process looks from the perspective of 
the application code. In this example, we compute the divergence of a heat conduction flux, 
F = —DVT, integrated over each active cell, using Gauss's theorem to convert the volume 
integral to a sum over faces of the normal component of the flux at each face. In the following 
pseudocode fragment, (Numcell) is the number of active cells that reside on the local processor, 
N+ (Numcell_clone) is equal to N plus the number of cells cloned from adjacent processors, 
T(l : A^_|_) is the temperature array on the local processor, and divF(l : A^+) is an array to hold 
the computed result, i.e., the integrated divergence of the flux on the local processor. The 
arrays T and divF are dimensioned to have elements: elements (1 : A'^) for local cells (those 
that reside on the local processor), and elements (A' -|- 1 : A^+) for the clones (corresponding to 
cells on adjacent processors). The example contains three typical steps, namely, a gather step 
(clone _get), a compute step (the loop over cell-faces), and a scatter step (clone.put): 

call clone _get(T) ! Get off-processor temperature data 

! i.e. T(Numcell+l :Numcell_clone) 
divF(l :Numcell_clone) = 0.0 ! Initialize all elements of divF 

do (for all active on-processor faces) ! in all directions 

! llo = local index of cell or clone on the lo-side of face 
! Ihi = local index of cell or clone on the hi-side of face 

divF(llo) = divF(llo) . D-face( ^ a.^^^^^ Ax'lM) )/2 )^-'-'' 
divF(lhi) = divF(lhi) - D_face(^^^TilM)_=^^gL^)^ 

enddo 

call clone_put("add" ,divF) ! Scatter data to adjacent processors 

! cumulate into on-processor elements 
divFd : Numcell) = divF(l : Numcell) /Volume (1 : Numcell) 
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Here llo and Ihi refer, respectively, to the cell or clone on the low-side and to the cell or 
clone on the high-side of the current cell- face in the loop over faces. The map between the active 
local cell faces and the adjacent active cell or clone indices llo and Ihi is built anew at the 
beginning of each computational cycle, together with the associated machinery used by the two 
communication routines clone _get and clone_put. Dface is the diffusion coefficient evaluated 
at the current cell face via an appropriate interpolation using data from the adjacent cells (e.^., a 
harmonic or geometric average) and A face is the area of the face. Ax/o and Ax^i are the sizes of 
the cells referred to by llo and Ihi. Once the off-processor contributions have been cumulated 
into the on-processor data via the clone_put, they are divided by the volume of the cell in 
order to ensure that divF has the correct dimension to be the divergence of a flux (V • F). The 
calculation of the gradient, by using half the sum of the widths of the cells {/S.xuo + ^xihi) allows 
this expression to be used without modification at the edges of a periodic boundary, whereas 
using the difference of the positions {xihi — xuo) of the two zones would generate nonsense in 
that case; otherwise, the two expressions would be equivalent. 

The clone data structure is built by the token library routines at the beginning of the compu- 
tational cycle and includes a list of the off-processor cells adjacent to the boundary of the local 
processor, a list of the on-processor cells adjacent to the local processor boundary, and buffers 
to hold corresponding data to be sent to and received from other processors. 

2.3 Parallel I/O 

When it is necessary to write (or read) restart dump files, RAGE can use either the MPI-IO or 
specialized bulk I/O libraries at the lowest level. At the code developer's level, a subroutine 
is invoked to write either a scaler or vector which may be the same or different on different 
processors. Multidimensional mesh-arrays, e.g. cell_momentum(l :nmncell, 1 : nmndim) , are 
written one mesh-array at a time with an outer loop(s) over the non-mesh index(es): 

call pio_io ( ' Scune ' , 'write', 0, 'do_hydro', do_hydro, pio_err) 
do n = 1 , numdim 

call pio_io( 'dif f ' , 'read', n, 'vel', vel( : ,n) ,numcell,pio_err) 
enddo 

The PIO package has both serial and parallel I/O capabilities. Serial I/O is the default, although 
a flag allows one to switch to MPI I/O or another parallel I/O package called bulkio which is based 
on asynchronous MPI calls that move data to a set of I/O processors. Both packages can be tuned 
for optimal performance using input file parameters, e.g. striping factors, configuration lists, etc. 
The serial I/O package uses MPI sends and receives to move the data from each processor to the 
I/O processor, overlapping the writing of one processor's data with the receiving of another's. 

When parallel I/O is activated, the lengths of the array on the individual processors are 
gathered and cumulated so that each processor knows at what location in the dump file"' to 
begin processing its contribution to the total. All of this data management and the calls to 
specific I/O library routines occur at a low level that the (physics) code developer never sees. 
In the case that the data is the 'same' or when serial I/O is selected, only the I/O processor 
(usually pe=0) does the actual work. Regardless of the type of I/O, a return flag {e.g. pio_err, 

^The original parallel I/O package was written by Howard Pritchard of SGI for the ASCI Blue-Mountain 
machine; a more portable version was written by Pat Fay of Intel to also run on the ASCI-Red machine, and it 
was ported to the ASCI-Q machine by Lori Pritchett of HP. 

*We choose to write a single disk image, for the convenience of the users, as well as for the flexibility to write 
with one number of processors and read with a different number. 
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above) indicates whether or not the data has been successfully read into the array or written 
onto the disk. 

Restart dump files are formatted to be valid Scientific Data Sets in the HDF conventional 
terminology; everything but character strings are written as IEEE 64 bit Reals and characters 
are written as ASCII character strings. On input, RAGE swaps the bytes on the fly when it 
detects that the "endian-ness" of the machine is different from the big or little "endian-ness" of 
the file it is reading, as well as converting Cray 64 bit Reals to IEEE 64 bit Reals when such 
dumps are encountered. 

2.4 Parallel Scaling Performance 

In running simple timing problems on various architectures (e.g. hydro-only or conduction-only 
on a fixed level mesh'), we have found relatively good scaling with processor count, in the sense of 
constant work (fixed zones) per processor, as shown in figure 4. The metric used for performance 
is the processing rate in terms of the number of cell-cycles that are processed per second on each 
processor, and is the inverse of grind-time that is often referred to. Almost a decade of different 
systems are shown in figure 4, the oldest being the ASCI Red system installed at Sandia National 
Laboratory in 1996 and the newest, the Blue Gene/L system, installed at Lawrence Livermore 
National Laboratory in 2005. The performance over this time has increased by approximately a 
factor of 20 resulting from both improvements in processor speeds, and network speeds. 

Ideal scaling performance is a straight horizontal line at the level of the performance of a 
single processor when a constant amount of work is mapped to each processor; very good scaling 
performance of SAGE is observed on these large-scale systems. The performance on almost all 
of these systems degrades only by about a factor of 2 at 1,000s of processors. The degradation 
results from the need for communication between logically neighboring processors as well as the 
need for collective operations. Note that the performance was measured on the largest sized 
systems installed except for the Blue Gene/L which will be upgraded from the 2K processors 
indicated in figure 4 to 128K processors during 2005 

Further details on the performance of SAGE on the ASCI White (IBM), ASCI Blue Mountain 
(SGI), and T3E (Cray) systems can be found in [8]. Details on ASCI Q (Compaq) performance as 
well as how SAGE was actually used in the optimization of the system performance is described 
in [9]. The performance of Lightning (Linux) has been described in [10], and early details on the 
performance of Blue Gene/L (IBM) is given in [11]. 

3 Adaptive Mesh Refinement Overview 

The adaptive mesh refinement (AMR) method used in RAGE was initially developed to solve 
multidimensional, time dependent shock hydrodynamics problems. The particular problem of 
interest at the time required high resolution at a shock front that could be kilometers away from 
other resolved features in the mesh. In choosing a method, it was desirable to have rectangular 

^The scaling problems are 3-d, consisting of three nested blocks of 7 = 7/5 ideal gases at different temperatures 
and pressures (p = 0.1,10,0.1 g/cm^; T = 10^,0.1,0.1 eV; p ~ lO^^*, lO" , lO" erg/cm^, side length -La;,i/,z = 
30, 15,6 cm respectively), using the same number of fixed zones on each processor. The same logic is executed 
in all zones whether the adverted quantities will be zero or non-zero. The problem can be run in hydro only 
mode (as shown in Figure 4, or in a pure conduction mode, not shown but giving similar scaling results - each 
face calculates a conductivity, and adds elements to a row and column of the matrix for subsequent inversion). 
Because of indirect addressing, the work per zone is the same whether zones are adapted or not, and it would be 
extremely difficult to force the same amount of adaption on every processor for a scaling study. 
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Figure 4: The scaling behavior of SAGE on a variety of systems in weak-scaling mode (constant 
work per processor). 

(square) cells where integration methods with well understood convergence properties could be 
used. 

A primary consideration was to ensure global conservation (of mass, momentum and energy); 
because the arbitrarily-orientated patch-based methods such as those of Berger and Ohger [12] 
make conservation difficult, we chose to have the underlying coarse cell boundaries coincide 
exactly with the finer mesh boundaries [13], which greatly simplifies the process of maintaining 
global conservation. 

The "patch-based" AMR of Berger et al. is not, however, used in RAGE. Our "cell-based" 
AMR method is most similar to that of Young et al. [14] - it is a block-structured mesh with 
cell refinement limited to bisection in each dimension where adjacent cells differ by at most one 
refinement level. 

RAGE uses a Continuous Adaptive Mesh Refinement (CAMR) algorithm; it is continuous 
in space and time in the sense that mesh refinement occurs on a cell-by-cell and cycle-by-cycle 
basis. The mesh is evaluated for refinement on every cycle throughout the entire domain. The 
overhead associated with this CAMR technique has been empirically determined to be about 
20% of the total runtime, which is small compared to the gains of using CAMR instead of a 
uniform mesh. 

The CAMR algorithm has two phases. The first or setup phase performs refinement at the 
start of a problem and the second or dynamic phase performs refinement during each timestep. 
It is convenient to separate the algorithm in this way to allow for different refinement criteria in 
each phase. 
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In the setup phase the base grid is refined by checking whether a cell in the grid is composed of 
a pure material or contains an interface. It is frequently desirable to allow material interfaces to 
be refined to a difl^erent maximum level than in the bulk interior of the material {e.g. during the 
passage of a shock). In the case of RAGE, material interfaces are determined either by changes 
in density or changes in material volume fraction as determined by the user. Additionally, if 
a cell and its neighbor cell differ significantly in density, pressure, or velocity component then 
refinement is performed on that cell. The user can also specify levels of refinement at predefined 
points or regions. 

Once the grid has been generated and initialized, the governing equations are integrated 
one timestep and the entire mesh is evaluated for refinement before any data is moved. This 
evaluation is a two-part process and occurs on every cycle for every cell, not just the top level 
cells. The reason for this will be discussed shortly. The first part of the process determines 
which cells are to be fiagged for refinement, coarsening, or neither. The second part of the 
process handles the addition of new cells, the removal of others, and their associated linkage 
to neighboring cells. In RAGE, activity tests are performed to refine the mesh ahead of shock 
waves. These activity tests use small changes in velocity and pressure to trigger refinement. Cells 
are also refined based on first-order estimates of truncation error in a Taylor series expansion of 
cell properties. Because error estimates for several cell properties are considered, the maximum 
error is saved for a cell. This method favors higher refinement in the interest of better accuracy. 

When all cells have been evaluated, the second, data-movement, phase takes place. Cells are 
added and removed according to how they are fiagged in the first phase. However, there are two 
rules that can override that phase: (i) no cell is allowed to be more than one level different from 
its neighbor (including diagonal neighbors) , and (ii) cells will only be removed if it is unlikely 
that in the following cycle that cell would be refined. The last condition is accomplished by 
basing cell coarsening on the error estimate in the mother cell as well as the daughters; thus the 
need to update mother cells as mentioned above. The mother cells are used because they would 
become the top level cells on the next cycle if the cell was coarsened. Algorithmically it is easier 
to update all cells rather than restrict the update operation to top-level cells and their mother 
cells. The number of cells unnecessarily updated ( "grand- mothers" ) is a small fraction of the 
total. 

When cells are added to the mesh, they are added in groups of known quantity, therefore 
only a reference to the first daughter cell needs to be stored, since the remaining cells follow in 
memory. Inter-processor communications are minimized by inserting the newly created daughter 
cells immediately after the mother cell in the global cell list. Load leveling is only done on those 
cycles where an individual processor's cell count exceeds the average cell count by some tolerance 
(approximately 10 percent). 

4 Hydro 

The hydrodynamic package in RAGE uses a two-shock or single-intermediate-state approximate 
Riemann solver to calculate the impulse and work on each face of a cell, using an alternating- 
direction-explicit (ADX) [15] technique. The Riemann solution is used to determine the tra- 
jectory in the donor zone along which a Lagrangian mass particle would move through the 
unperturbed state (and possibly the "contact" state), in order to just reach the face at the 
end of the timestep. This trajectory distance is then used to integrate all quantities that will 
be advected from the donor zone, and allows us to finesse some details in the typical Eulerian 
Riemann-solver methods. 

For each direction, we test on overall "volume" changes within zones, re-evaluating the sound 
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speeds where appropriate and iterating the Riemann solver. This semi-impHcitization allows us 
to handle collapsing bubbles or strongly shocked zones even though the solver itself uses only 
a "weak shock" approximate Riemann solver. This goes beyond strong-shock or general Rie- 
mann solvers to allow a coupling from logically separated zones through their common neighbor, 
preventing unphysical results from occurring due to the independent updates. 

We will demonstrate that this method is 2"'''-order accurate in space and time on smooth 
problems and l**-order accurate on discontinuous or shock problems. In general, our philosophy 
has been to develop a method that has a demonstrable convergence rate under mesh refinement, 
and to then work at reducing the error at a given level of refinement . 



4.1 Conservation Equations 

The hydrodynamical equations solved in the hydro package are: 

dt 
d{pu) 



dt 

d{pE,r. 



+V • (pu) 
+V ■ (puu+P) 







= Pg 



dt 



-V • {pEraU + P • U) = pg ■ 



u 



(1) 

(2) 
(3) 



where RAGE stores extensive arrays of mass M — Jy pdV, momentum AfU = Jy pudV, and 
total energy MEm = Jy pEmdV = jy p{e~\- ^u^)c?y, with u the specific momentum (velocity), 
and Em the specific total energy. The treatment of gravity will be deferred to section 6; for the 
purposes of this section, we will assume g = 0. 

RAGE has a strength-of-materials package, which includes a simple failure model and devia- 
toric strain-dependent stresses {e.g. Steinberg-Guinan) , but for this paper, we will only consider 
isotropic stresses, P = pi. RAGE also has an operator-split heat conduction package that solves 
^^''dt"^ ~'~ ^ ' ("'^^^m) — 0; but this is infrequently used compared to the radiation diffusion 
package discussed in section 7. 

The conservation equations can be converted into a Lagrangian form which will be used 
to calculate U and P at the half-timestep, for input into the Harten-Lax-van Leer (HLL) [16] 
approximate Riemann solver: 



= 



= 



= 



dp 
It 

dU 

dt 

dErn_ 

dt 



1 



u 



-V • P 



V • (PU) or = 



de 
dt 



P 



Using Eqs. (4) and (6) and the thermodynamic identity. 



V -u 



p dp 
p'^ de 



= 



dP 

dt 



U 



(4) 
(5) 

(6) 

we can also write: 
(7) 



Aside from E^, and unless otherwise noted, in this section we will generally use the convention 
of upper case letters for Lagrangian-frame variables, and lower case for Eulerian-frame variables. 



4.1.1 Boundary Conditions 

The hydrodynamic package in RAGE has only two real boundary conditions available to the 
user: reflecting and outflow; it has a "treatment" to allow inflow. 
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The reflecting boundary condition (BC) is self-explanatory, and was the original BC in the 
code. The justification was that with AMR technology, one could use very large zones to put the 
boundaries far away from the region of interest, and adapt that region appropriately. All (normal) 
derivatives are set to zero in zones adjacent to a vacuum face, and all (normal) components of 
velocity are set to zero on such faces. 

Since then, we have implemented outflow boundary conditions, individually speciflable on 
the high and low faces of the mesh in each direction. In this case, zonal values of derivatives 
normal to the face, instead of being the limited average of the gradient on the far face and zero 
at the vacuum face {i.e. therefore zero), are just the value calculated on the face of the zone not 
adjacent to vacuum. The velocity of fluid on this face is calculated by the Riemann solver and 
is not forced to zero, as with reflecting BC's. 

In order to meet the demand for some type of in-flow boundary condition, we have developed 
a crude "freeze boundary" condition. In this case, the user specifies a geometric extent of the 
problem space in which all properties are to be frozen. At the top of the timestep, the list 
of zones falling in this space are recorded, along with the values of all quantities subject to 
advection (in those zones); the hydro then operates as usual, and at the end of the timestep, 
those zones are re-set to their beginning of timestep values. This has been adequate to handle 
the class of problems that have a constant inflow, and can be extended to allow simple temporal 
ramps (albeit only first order accurate in time). 



4.2 Limiters and Linear Reconstructions 



Although users have the choice of a number of different limiter methods with which to calculate 
gradients, the two most commonly used are the minmod and the modified van Leer. Slopes, s±, 
are calculated within a zone by first linearly interpolating (between neighboring zones) a "face- 
value" of the quantity on the faces normal to a given direction on the mesh. These face- values 
are then broadcast into cell-centered "hi" and "lo" arrays, at which point one can calculate the 
slope of the quantity between between the high side and the ccntroid of mass (s+) or between 
the centroid and the low low side (s_). 

The minmod limiter generates the most conservative slopes for reconstruction. 



^inininod 



minmod(s+, s_) 

^[(sign(s_) -|-sign(s+)] • min (|s_|, |s+|) , 



allowing monotonic discontinuities at a face between zones without introducing any subzonal 
saw-tooth patterns in the reconstructed field. This ensures that an approximate Riemann solver 
will satisfy the correct entropy conditions. 

The van Leer slope and limiter [17] can be defined in terms of a generalized minmod: 



'^vanLccr 



minmod (- 



(sign(s_) -|-sign(s+)) 



sign(s_) -I- sign(- 



■ min ( 



\Ts^\,\rs+\) 



where the average, is the 2"''-ordcr estimate of the slope between the three zones and 

van Leer's definition of this limiter has T = 2 {J- = 1 reduces this to the ordinary minmod). 
Our modified version uses = |, which van Leer [17] has noted gives about the limiting power 
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of his harmonic form, Sharm = [(sign(s_) + sign(s_|_)]. We have found that the use of the 

standard J- = 2 in this limiter introduces a small amount of ringing behind some shock fronts in 
problems we have run, while using ^ = | does not produce any such visible anomalies. We have 
also found that the problems associated with stronger-than-minmod limiters {i.e. the saw-teeth 
in the reconstruction) is accentuated in two-dimensional problems, even if it is imperceptible in 
one dimension, and these problems are alleviated by use of = |. 

Our goal is to have a spatially and temporally second order accurate hydrodynamic method, 
physics permitting. For such temporal accuracy, we use a half-step, whole-step method to 
advance the states for the Riemann solver. Second order spatial accuracy can be regarded as 
that accuracy that will exactly preserve linear solutions across the mesh, which means that a 
minmod limiter will converge to the same order as a van Leer limiter, although in general the 
van Leer offers improvements in accuracy at a given level of refinement. 

Linear profiles are constructed in each zone with monotonically limited slopes with the zonal 
average located at the center of mass of the zone. For example, 

p(r) p+ {r - rc„i)Vp , 

p{r)dV ^pV = M, 

V 



rp{r)dV = Mr^^ . 

iv 

Kinetic energy is calculated with a 3-point Gaussian quadrature, so that the worst case, S*''- 
order polynomial (in Id spherical geometry), J p(r)u^(r)dr, is integrated exactly, allowing us 
to always calculate an "exact" internal energy, 

ME^ r^p{r)u^{r)dr^ /M . 

The velocity, u{r), is determined by a two-step method. Beginning with the specific momentum, 
tP — MU/M, we calculate its gradient, Vu*^, and integrate an estimate of the momentum, 

p{r)u{r)dV = Mifi + / (r - rcm^S/ p\/u°dV . 
V Jv 

We then recalculate the velocity to minimize the error between this estimate and the known 
value of momentum, 



u=[MU- J{r- r„nfVpVu°dV]/M 



We have not found it necessary to converge the velocity any further, but we do calculate Vw for 
use later. 



4.3 Riemann Solver 

A single-intermediate-state approximate Riemann solution [16] takes values that have been re- 
constructed on the Left and Right sides of a cell's face {Ul, Pl,Ur, Pr), and combines them into 
the intermediate state, {U*,P*), shown for example in Fig. 5 lying between the lines Ul — Sl 
and Uh + Sr. This state is a weak (i.e. integral) solution to the momentum equation on both 
sides of a face (we use the "mass" coordinate, dm = pdx = pc^dt in the following formulae): 

dt f° dm(— -0- dt f^"'^"* dm(— ^] 

Ja J-(pc)r^t \dt dm] J„ J„ \dt dm] 
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Figure 5: One characteristic lies in the 
Left half plane. Fluxes depend on the Left 
(donor) and Intermediate (*) states. 
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Figure 6: All characteristics lie in the Right 
half-plane. Fluxes only depend on the Left 
(donor) state. 



On the assumption that a signal propagates from the face at a "signal speed" of pc into each 
cell, these two equations in two unknowns can be solved for the intermediate state: 

^* ^ {pc)lUl + {pc)rUb-{Pr~Pl) 

{pc)L + {pc)r ' ^ ' 

p* ^ {pc)lPr + {pc)rPl - {.Pc)l{.Pc)r{Ur - Ul) , . 

{pc)L + {pc)R ' ^ ^ 

where the impedances, pc, are the fastest signal in each zone [16], 



{pcY = J(poCo)2 + pomax(0,P*-Po) (l + ^-^) , (10) 



2 (Oo rfe / 

found by iterating Eqs. (9) together with the "L" and "R" versions of Eq. (10) to pressure 
convergence. We have not, however, found it necessary to perform such iterations, given the 
"anti-cavitation" logic described below in section 4.3.3, and use just the "weak-shock" solution 
for the intermediate state, solving Eqs. (8) and (9) with the zone-centered, beginning-of-timestep 
values of cl and cr. We do use P* in a single pass through Eq. (10) before our advection logic, 
because this (pc)* can be shown (via Rankine-Hugoniot relations) to be equal to poUs, where Us 
is the speed of the shock/rarefaction in the unshocked fluid (of density po). The characteristic 
labeled "Ul — Sl" in Fig. 5 (and Fig. 6) moves in the lab frame at a speed of Ul — Us{l) (similarly, 
the one labeled "[/« -I- Sr'' moves at Ur + U^^r))- 

We want to ensure that advection is done as accurately as possible, and use certain Lagrangian 
concepts to effect that end. Note that a Lagrangian test particle - in the absence of any gradients 
- will start at the position "—D" in Fig. 5, and will drift at a constant velocity (Ul) until it hits 
(or is hit by) the characteristic wave that sweeps past it, jerking it to a new drift velocity, U* , 
with which it will drift to the face of the zone at the end of the timestep. If we take our linearly 
reconstructed density profiles and integrate them between —D and 0, we will be able to advect 
the "exact" amount of mass that needs to be moved from the donor cell to the donee cell. 

The direct-Eulerian method in RAGE is a hybrid that uses a Lagrangian Riemann solution 
to infer the correct Eulerian fluxes, in particular, a "manifestly obvious" prescription for the 
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multi-material continuity equation's mass fluxes. Using a Riemann Solver in the solution of any 
set of (conservative) hyperbolic equations involves four steps [18]: 

• The calculation of interpolated profiles for the dependent variables, 

• The construction of time-centered left and right states, 

• The solution of the Riemann problem with those states to generate the evolved state, 

• the conservative differencing of the fluxes in the original equations. 

Rather than calculate the left and right states by averaging at t" over the domain that will 
be causally connected to the face [18, 19] during the timestep, we calculate point values on either 
side of the face: pl + lAx^Vpi and pn — ^AxrVph. As in van Leer [20], it is understood 
that I Az is actually the distance from the face-center to the centroid of mass. For T-cell faces, 
it really is Ax • Vp. These {p,u)± point values are then advanced to the half time using the 
Lagrangian expressions (Eqs. (5), (7)), so that the point value Riemann solutions calculated from 
them will be second-order accurate in time. 

RAGE goes beyond the standard Riemann solver logic to make a further attempt to include 
gradients of velocity and pressure in its reconstruction of the amount of mass to flux across 
a face. Clearly, if there were no gradients, the jump in velocity across the leftward traveling 
characteristic in Fig. 5 would always be U* — Ul- However, if there is a gradient in velocity, 
then our Lagrangian test particle's initial velocity will nominally be Ul — Dux, where is 
the gradient of velocity at the beginning of the timestep; it is reasonable to also assume that 
the final-state velocity will not be exactly U* either. Thus, we regard the Riemann solution as 
giving us "point values" at the half-time rather than time-averaged values, and make the further 
assumption that the velocity jump across the left-bound characteristic in Fig. 5 is a variable, 
Vj{f), where / is the fraction of the timestep that elapses before the Lagrangian particle-path 
hits the characteristic. We define Vj as: 

- u*{f)-uuf), 

- ([/q* + U;SLfAt) - {Ul + u^SLfAt) , (11) 
= U^ -Ul + f{U* - Ux)SLAt , 

where we denote the gradient in velocity at the beginning of the timestep with = Ux, and 
the gradient at the end of the timestep (i.e. the gradient of the Riemann solutions) with [/*. 
The characteristic Lagrangian speed, Sl, is calculated after solving our approximate Riemann 
problem, 

g ^ f-Ci iip*<PL, 
\—{pc)*l/pl otherwise, ' 

with (pc)* as in Eq. (10). The sign of Sr is positive, meaning that all relevant characteristics 
(those in a donor cell) move at "U + S" . 

These expressions for U{f) in Vj{f) are not immediately obvious: in principle, the shock or 
the tail of a rarefaction moves at a speed of U* + Sl, while the head moves at Ul + cl- In order 
to be consistent with the single-intermediate state approximation, we assume that the distance 
from the contact to the tail is SlJ At (measured from the contact moving at U*), and we assume 
that the distance from the face to the head of the rarefaction is also SlJ At (measured from the 
face in the unshocked fluid frame moving at Ul)- Thus U*{fAt) and UL{fAt) at Eq. (11) and 
the jump across them. 
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If there were no gradients in the problem, then the characteristic in the donor cell will be 
moving at speed Ul + Sl, and the Lagrangian point will have been drifting from the coordinate, 
—D, with a velocity of Ul- Their intersection at a given time gives a constraint, —D + ULf^t — 
{Ul + SL)f^t- It will be noted that the actual velocity of the fluid drops out of this constraint 
to give, —D — SLf^t. When gradients exist, the characteristic, Ul + Sl, ought to show some 
curvature, but we assume that it is locally straight at the point of intersection, traveling at some 
unspecified X + Sl- We have a second constraint from the definition of the total distance that 
Lagrangian point moves to get to the face. In general we have 

-D + XfAt-{X + SL)fAt = 0, (12) 
-D + {UL-Du,)fAt+{UL-Du, + v,){l-f)At = 0, (13) 

which simplifies to 

-D~SLfAt - 0, (14) 
-D{l + u^At) + ULAt + Vjil - f)At = 0. (15) 

Solving the second equation for D, and plugging it into the first, we have an equation for / 
alone: 

r ^ ^ ^ -iUL+Vj{l-f)) 

^ SLAt SL{l + u,At) ' 

^0 = UL + v,{f) + f[SL{l + u,At)-v,{f)]- (16) 

Once this equation is solved for / {e-g- by Newton-Raphson iteration), the second equation, 
(15), can be immediately solved: 

^ ^ [UL + v,{f){l~f)]At 

1 + u.^At ' ^ ' 

Clearly, if we divide D by At, we will have an average velocity, u, that should compare to what 
would have resulted from a more standard Eulerian direct solution, Building on that 

insight, we construct the analogous value of pressure, 

u ^DAt^ mL-Du,) + v,il~ f)] , (18) 
P = [{PL-Dp^)+pj{l-f)] , (19) 

where the pressure jump across the characteristic, 

Pjif) - P*if) - PlU) = -Pl + f{P* - P.)SLAt , 

is the analog to Vj{f), with P* the derivative of the Lagrangian Riemann pressures across the 
cell (the "point" values on the two faces divided by the size of the cell) . 

Consider now the case when U* > and (Ul + Sl) > 0, so that no characteristic reaches into 
the donor zone (Fig. 6). In this (transonic/supersonic) case, "D" is determined trivially from 
the expression for the straight path in space-time, 

-D + {U2^'/''-DuL,,)At = , 

= ^ ^ , (20) 
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where ml,^ = (s^)^' ^^"^ jj^+i/s jg ^^ic time- advanced value of the velocity on the left-side of 
the face. 

In order to calculate the value of p for the impulse in this supersonic case, recall that any 
Lagrangian derivative can be written as an Eulerian derivative: 

dP dp _ dp 

dt dt dx 

The Eulerian value at fixed position at the end of a timestep is given by: 



dx 

Given that p^ — to the order of accuracy we need, and since p" = P" at the beginning of the 
timestep, 

p = = P"+i - Atu^ , 

ox 



= P"+i - D 



dP_ 
dx 



where the second line follows from the definition at equation (18). The average velocity, u, could 
have been obtained by similar reasoning, 

dU _du _du _ w"+i-it" n+i^U 

dt dt dx At dx ' 

Jjn+l 



1 + U^At 

and agrees with that from equation (20). 



4.3.1 Impulse and Work Terms 

As we have seen, the volume of fluid that is moved from the donor to donee zone is based on 
the integral from —D to (for rightward flowing fluid). In slab geometry, this is AfacevAt, but 
in cylindrical and spherical geometry, we can calculate the ± [r^ T {r ~ D)^] ((5=1,2) exactly. 
Using the average values of p and u on the face, we calculate the impulse at the face fi (face 
normal in the i direction): 

pAf^At , 
{mv,)l - If , 
imv,)l + If , 

and similarly, the work done on the face is given by, 

Wf = pAf^ir'^ - {r - D)^) , 

{mE^r+^ = {mE^)l + Wf , 

(the area, A f. , has dimensions of solid angle in spherical geometry, length in cylindrical geometry, 
and area in slab geometry). These changes to mass, momentum and energy are temporarily 
stored as increments in case the anti-cavitation logic requires us to repeat the Riemann step 
with improved sound speeds. The state variables are updated before the next directional sweep 
in the alternating direction explicit (ADX) logic. 



If = 
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4.3.2 Artificial Viscosity 

It has been noted that Godunov solvers in multi-dimensions have to be augmented by some 
form of artificial viscosity [21] or by a more dissipative low-order solver [22]. RAGE too has an 
artificial viscosity, specifically in order to symmetrize numerical errors. 

In particular, we want symmetric (comparable) errors on problems involving shear, inde- 
pendent of the orientation of the shear. Consider a 2-d box with {e.g., 4) alternating bands of 
fluid (aligned with mesh boundaries) moving to the left and right with periodic boundary con- 
ditions {e.g., Kelvin-Helmholtz without the perturbations). Because the velocity field contains 
discontinuities, Strang splitting will no longer be second-order accurate in general; but given our 
direction-split solver, such bands decouple completely, giving zero numerical error in this orien- 
tation. Now consider the flow field rotated by 45°: on each face that straddles the discontinuity, 
the advection logic on each face will calculate an average velocity of zero, resulting in no change 
to those border zones; but the intermediate state Riemann logic will find a non-background pres- 
sure (the two velocities are in opposite directions and limiters turn on) on the face, and that will 
flux some negative momentum into a positive momentum zone (or vice versa) , thereby smearing 
the discontinuity. This smearing is due to the f^'-order (slope limited) Riemann solution being 
Strang-split in this orientation. 

Our approach to such rotational asymmetry is not to attempt to remove the error along the 
diagonal, but rather to add some error along the axes (but in such a way that in smooth flows, 
it will have zero magnitude and thus contribute no error). To that end, RAGE has developed a 
tensor-like artificial viscosity of the form 

^ « O 2/1 IN iPc's)\\L{pCs)\\R 

Qf\\d = /3h max(0, 2cos 6'||d - 1) — — — — - — Av\\d , 

{pc's)\\L + {pc's)\\R 

max{viiL^,vilRj 
Ec=imax(^;2^^,w2^J ' 

= 0.25 , 

for each direction, d ^ f, parallel to a face /. Thus the y and z {d) components of velocity- 
jumps contribute to those components of momentum fluxed across a face normal to the x (/) 
direction. (Note that in smooth flows, Av^^^ because the (unlimited) slopes will now 
construct continuous velocities on each side of the face, turning off the viscosity.) The energy 
fluxed across the face / by these components is calculated by dotting this component of the 
tensor into the velocity parallel to the face, dW±f = AfAt Qf\\d • where 

{pCs)\\LV\\L^ + {pc'JwRVWR^ 
{pc'shL + {pc'.)\\R 

Figure 7 shows an example of the effect of such a viscosity on the vortical flow induced by a 
jet along along the horizontal axis, in this case for the double Mach reflection test problem[21] 
(960 X 240 mesh. Ax = Ay = 1/240, t = 0.2). This same viscosity usually eliminates the 
"carbuncle" growth [22] seen at shock fronts along the axes of a multi-dimensional problem. 

4.3.3 Anti-Cavitation Iteration Logic 

Consider the case that a single, almost empty zone is sandwiched between two high-pressure 
zones - for example, a collapsing bubble. Iterating a standard Riemann solver will ensure that 
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Figure 7: Double Mach reflection test problem, blown up in the vicinity of the Mach stem, 
showing the effect of the tensor artificial viscosity on the vortical flow {Ph — 0.0, left, Ph — 0.25 
(default), right). 

no more than half the zone will fill with material from any face, but on the next cycle - if the 
fluid was relatively incompressible, like water, the density in that former bubble might become 
1.01, in which case the over-pressure would cause the now over-dense zone to explode on the 
next timestep, sending an unphysical shock throughout the fluid. 

To prevent this, each face records the effective (1-d) change of volume, u At Aface, which 
is added and subtracted from the appropriate zone-centered volumes, and a new volume is 
calculated: Vnew = Kw/(^o;d + AV). If the "zone" shrinks sufficiently, updated densities and 
specific energies are used to invoke the equation of state for a new sound speed in those zones 
(and those zones only), and the Riemann solver is re-invoked (for the entire mesh), until the new 
volumes' changes are within some small tolerance. 

Each such iteration requires a new inter-processor communication to pass the sound speed 
information, and the Riemann solver has to re-communicate the intermediate state values on 
the faces back to the zones on either side of the faces (for the /-finder logic). Otherwise, this 
outer iteration loop costs little more than a typical Riemann iteration loop. However, by doing 
the iteration here, we can avoid the over-pressures that occur from causally disconnected faces 
updating their common zone with too much material. 

van Leer [20] found that the weak solution is more than adequate for rarefactions, and 
Colella [23] found that only 1-2 iterations were necessary in a large class of problems he examined. 
In those few cases, we can also handle strong shocks with our beginning of timestep sound speeds 
in the weak-shock Riemann solver, since the few zones where a shock occurs will force one of these 
outer iterations with the shock-modified sound speeds in those zones. (The biggest problem we 
have found occurs when a very strong rarefaction would straddle the face if we allowed it. Then 
our advection/donor logic will make a large first order error, as seen in Fig. 13 on LeBlanc's 
shock tube problem.) 
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4.4 Advection 

When our problems involve more than one material, it is not enough to just integrate the 
linearized fractional mass-densities in order to update the equations {e.g. AAfj — J^^ pjr^dr). 
If we wish to ensure that materials in equilibrium remain in equilibrium, we have to advect a 
certain volume of each material (based on the linearized gradient of volume fraction) , multiplied 
by the partial mass density or partial internal energy density - for the continuity or energy 
equation, respectively. This method works because the partial energies as well as partial volumes 
of materials are returned by the Multi-Material EOS package (which finds the partial volume 
that each material has to occupy in order that the weighted partial densities sum to the total 
density, and the weighted fractional energies sum to the total energy of the zone; see section 5, 
below) . 

Because RAGE is a single- fiuid code, all specific momenta are identical, and nothing fancy 
needs be done to update the momentum equation - it suffices to integrate the bulk momentum 
profile. 

The original RAGE philosophy was to minimize the complexity of the various physics pack- 
ages in the code. In the context of advection, this meant using the same slope-limiter logic to 
advect fractional masses that was used in the hydro to advect momentum and energy; if sharper 
"boundaries" were desired, one should increase the level of adaption, rather than impose a "sub- 
zonal" model of an interface that wasn't part of the equations being solved. In the context of 
an ASCI production code, not all problems run could afi^ord to refine the mesh to the degree re- 
quired - even ASCI computers have finite memories - and the Interface Preserver technique was 
developed to sharpen contact surfaces without further adaption. In a limited number of cases, 
it was found that even this is not sharp enough, and work is on-going at LANL to implement a 
Volume of Fluid method [24] that will confine mixed cells to a single interfacial zone. 

4.4.1 The Interface Preserver (IP) Method 

Having taken into consideration the existing data structures and approach to adaptive mesh 
refinement in RAGE, an efficient algorithm has been chosen to preserve the steep slope or profile 
of a given volume fraction. The approach makes use of the compact stencil currently used in the 
code. Compact means that only those mesh cells directly adjacent to the mesh cell at which a 
slope is being calculated are used in the reconstruction. Early renditions of this technique were 
developed in the late seventies and early eighties by Harten [25] and subsequently modified by 
Yang [26]. Although these methods are commonly termed slope steepeners, they were originally 
called artificial compression methods. The motivation was purely for contact discontinuities. 
Here we have adopted the approach of Yang with modifications to capture material interfaces. 
For ease and clarity, consider volume fraction (j) distributed on a one dimensional Cartesian 
mesh. If the volume fraction (pi is to be reconstructed using the IP method, an initial slope Si is 
always evaluated using minmod limiting (see section 4.2), a standard option currently available 
in RAGE. With the slope so determined, left and right face values can be calculated by 

where (j)f and are the left and right interpolated cell face values of (pi . To modify the initial 
minmod slope, we first determine the difference in the face values of (p on either side of the 
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and the change in slope is then evaluated as 



AS, = 2(33 a,) 



minmod((55 



(21) 



Aa; 



The final, but not necessarily monotone slope, S* , is then 



S* ^ S, + AS, . 



In Eq. (21), is a "discontinuity" or "material detector" that takes on values between and 1 
and is evaluated as 



This "detector" is effective in the following manner. At a material interface there will be a 
variation in volume fraction over a small number of cells giving a value of of order 1 while 
away from material interfaces, such as within a pure material or where material fractions vary 
over a larger number of mesh cells, its value will be near or equal to zero. The idea at this 
point is that this detector will not allow a compressing of materials in regions where material 
has been either intentionally mixed or the interface is no longer resolvable by the current mesh 
spacing. The factor of 33 in front of the discontinuity detector is a somewhat empirical constant 
controlling the onset of the preserver. Notice that if is 1.0, then AS" is rather large (due to this 
factor of 33) and most certainly makes the final slope non-monotone. This aspect requires a final 
adjustment to S* to ensure that it produces no new extrema relative to the surrounding data. 
This operation is always performed at the end of the slope calculation. This final limiting gives 
our final slope Si from S* . Although the presentation here is for the simplest of computational 
grids the actual implementation in RAGE does work with adaptive mesh refinement. Currently 
the IP method is used on volume fractions for all materials, all of the time, with a, being the 
sole "switch" for turning it on or off. 

4.4.2 Implementation and Integration of IP into RAGE 

The implementation into RAGE is more complicated than the above presentation due to the 
need to contend with variable cell sizes, the existence of T-cells involved with the AMR, and 
the use of one and two-dimensional meshes with non-Cartesian symmetry. However relying on 
the coding already in place has allowed for a relatively straight forward implementation of the 
method. 

It turns out that for many applications outside of the area of ideal gases, the IP feeds back 
detrimentally on the hydrodynamics and yields unphysical results. We have implemented a 
technique where the IP is turned off when a shock is passing through a cell, it being preferable 
to drop the hydro variables to truly first order to alleviate some of the problems. Unfortunately, 
it has also been found that turning off the IP even for short amounts of time will allow enough 
material diffusion that the material detector in the IP will no longer see the material interface 
as sharp and never come back on even after the shock has passed. 
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Figure 8: Spherical Adiabatic Compression: Li norm of density as function of position 
for Ax, Ax/2 and Ax/A at times 0.2, 0.5, and 0.9 (the higher resolutions are the 
sharper); analytic values of density for t=0.2, 0.5, 0.9 are also shown at top of figure. 
A first-order error can be seen to advect in from the pseudo-inflow boundary condition. 



Future work will consider additional control switches. The IP is appropriate, in particular, 
for interfaces undergoing pure compression or expansion. However, interfaces where vorticity is 
large indicate an amount of physical mixing in which its use would be inappropriate. Thus an 
additional control switch may need to key off the level of vorticity within a given cell. Also, 
it has been suggested that as soon as material strength is turned off for a material that IP be 
turned off as well. 

4.5 Accuracy and Limitations of the Hydro Algorithm 

RAGE is second-order accurate in space and time when used to calculate "smooth" problems (and 
first-order accurate on discontinuous or shocked problems) . We will demonstrate this by running 
problems on fixed uniform meshes with fixed timesteps, halving and quartering the zoning and 
timesteps in order to ensure known conditions (the adaption algorithm is not designed to be 
self-similar or scale invariant, so refined adapted meshes might not have exactly twice as many 
zones as coarse adapted meshes). 

Godunov [27] has shown that conservation of entropy can be obtained by a linear combination 
of the mass, momentum and energy conservation equations, which means that since RAGE 
conserves mass, momentum and energy, it also conserves entropy. In order to demonstrate 
this, we refer to an adiabatic compression test problem (Fig. 8), where u(r) = — r, (po = 
1, Tq « 0), and show that the Li error of density hovers around machine accuracy independent 
of mesh resolution (until a first-order error propagates in from the outer boundary's pseudo- 
inflow condition). Surprisingly, the internal energy (not shown, not surprisingly) is first-order 
accurate under mesh refinement. 

Figures 9 and 10 show RAGE results on another smooth problem, a "wave in a bucket". 
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where a sinusoidal 10~^ density perturbation is applied to a unit density fluid initially at rest. 
In Figure 9 the zone size and timestep were both halved for each curve; figure 10 shows why 
we usually halve zone size and timestep simultaneously ~ the coefficient of the Ax^ error term 
is much larger than that of the At^ or Aa; At terms - it save us a few runs. Not shown are 
similar results obtained with 2-d waves in a square bucket, which are also second-order accurate 
in space and time - given our Strang splitting and the one dimensional results, this is to be 
expected. Figures 11 and 12 show results on a weak (Sod) shock tube problem, and Figs. 13 
and 14 show results on LeBlanc's strong shock tube problem, both of which are demonstrably 
first-order accurate as expected. 

The order of accuracy is determined by reference to the Li error norms shown in Figs. 8-14. 
If we assume that the theoretical value is given by the computational value plus a truncation 
error, 6 — C{Ax) + aAx^ + • • • , then the ratio of Li errors on two different meshes allows us to 
determine r: \9 - C{Ax)\/\0 - C(Aa;/2)| = 2^ The upper portions of Figs. 11, 12, and 14 show 
the result of calculating r, 

In 161 - C(Aa;)| - In 161 - C(Aa;/2)| 

T = 

ln2 

for adjacent sets of Li errors in the lower parts of the figures (the cycle-0 results have no error, 
and r — 0). In the problems that involve shocks, we have seen that this order of accuracy, or 
convergence ratio, is closer to three quarters than one, for reasons we will suggest presently. The 
same method applied to smooth shockless problems does generate convergence ratios very close 
to 2 {e.g. 1.95-1.98), when applied to the point-wise and mesh integrated results of Figs. 9 and 
10. 

As can be seen in Fig. 10, the Li error is sometimes unchanged when At is refined at a given 
Ax. For this reason, we quote convergence ratios based on runs where both At and Ax are 
simultaneously halved. Then the calculation of r is insensitive to whether the leading error term 
is aAx'', /3Ax'^/2^r/2 or jAf. 

4.5.1 Why Shock tube convergence rates are less than Unity. 

The monotonicity constraints of a finite-difference scheme will smear a sharp shock over 3 or 4 
zones, no matter what the spatial zone size, so that those zones will always contribute about the 
same discrepancy, |Xi — Ci | ^ 0, (X=exact, C=Code, « G (3 — 4 zones) no matter what the zone 
size. If this dominates the error, then on mesh refinement, one expects that the new error will 
be half as much, resulting in a convergence ratio ~ 1. 

However, it does not seem to be commonly known that the typical order of accuracy in 
the treatment of a contact is C'(^^^)[28], where n is the order of the difference scheme. This 
means that a contact in a shockless (otherwise 2nd-order accurate) problem would be 2/3 order 
accurate, while a contact (in an otherwise Ist-order accurate) shock tube problem, would be 
1/2 order accurate. While this doesn't dominate the overall error, it does drag our shock tube 
convergence rates down to approximately 3/4. Similar results are seen for MUSCL and WENO 
schemes involving shocks [29, Table 5, p. 266 and Table 19, p. 277]. 

We have run a further problem, a 1-d Mach 10 shock driven into resting material (based on 
the 2-d double Mach reflection problem of Woodward and Colella, [21]) to show that a shock 
without any contacts really is first order accurate in space and time. The table inside Figure 15 
shows that the Li errors in both density and velocity drop almost exactly by factors of 2 under 
mesh and timestep refinement; the figure itself shows the Li errors in density (multiplied by 
10) compared to the highest resolution's density profile. The same error noted by Woodward 
& Colella, caused by specifying the time-dependent boundary conditions based on an infinitely 
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Figure 9: Linear Wave: Li norm of velocity 
as function of position at time t=0.25 (quar- 
ter period) for the perturbative wave problem. 
Double- humped curve at top is \uanaiytic\', non- 
linear effects begin at 10"''. Spatial convergence 
is 2"'''-order for the 5 Ax's shown, although finest 
is near machine accuracy. 



Figure 10: Linear Wave: Mesh-integrated Li 
norm of density as function of time for per- 
turbative wave problem. Spatial convergence 
is 2"''-order; overlying curves at fixed Ax have 
At, At/2, At/A timesteps. The two somewhat 
"out-lying" (yellow) curves near bottom were 
near the Courant limit (c?x/4,8 « Csdt/1,2). 



sharp shock instead of a numerically broadened one, show up here as two dips in the density 
behind the shock front. ^ 

We present in Figure 16 a calculation of the 2-d double Mach reflection test problem from 
which the 1-d version was abstracted. Without an analytic representation of the entire density 
field, it is not possible to do more than compare the position of various features, all of which 
agree qualitatively with other publications {e.g., our linear reconstruction algorithm's 240 x 60 
and 480 x 120 mesh results should be compared to the MUSCL scheme in [21, Figure 9e, lowest 2 
plots], or to [30, Figure 8 bottom plot. Figure 9, top plot] or [31, Figure 8, top 2 plots], although 
the last is a PPM (piecewise parabolic) method with 4*^* order steepening, and our scheme is 
closer to PLM (piecewise linear method) with 2"'^ order steepening). In this vu-graph norm, one 
can see the sharpening of contours by a factor of two as expected, although the magnitude of 
the error is such that, for example. Woodward and Colella's Ax =1/60 resolution calculation 
looks closer to our Ax — 1/120 resolution. 



4.5.2 Grid Imprinting and Grid Seeding 

It may have been remarked {e.g., see Figure 1) that the square zones of RAGE do not provide 
a perfect representation of circular or spherical objects - the zoning results in a sawtooth repre- 
sentation of what should be a smooth curve (or inclined plane) . There are a number of different 
cases in which this could present a problem, and we briefly discuss them. In the case of smooth 

®One would expect to get similar if not better results with a slab variant of the Noh problem, slamming cold 
fluid into a wall, and the average convergence ratios for the velocity and pressure are indeed 0.96 and 0.95; the 
ratio for the density, however, is only 0.80 — presumably blamable in this case on a wall-heating effect that acts 
like a contact. 
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Figure 11: Log plot of Space-integrated Li norm 
of velocity as function of time for the modified 
Sod shock tube problem (lower 5 curves). Con- 
vergence ratio (upper 4 curves) is close to 1st 
order (0.75). 
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Figure 12: Linear plot of Space-integrated Li 
norm of density as function of time for the mod- 
ified Sod shock tube problem (lower 5 curves). 
Convergence ratio (upper 4 curves) is close to 
1st order (0.75). 



shockless fiow, we have established that the Strang splitting results in a 2nd-order accurate hy- 
dro scheme; thus, while the grid may "seed" some fluctuations in volume-fractions around the 
circumference, the overall motion will be as correct as the hydrodynamics, and Raylcigh- Taylor 
perturbations in 2-d slabs, for example, do grow at the correct rates. In the case of shocked 
flow, there are two cases - a round shell may be converging (as in an ICF capsule implosion), 
in which case no problem arises with a shock going through such a shell and converging at the 
origin; however, in the case that a shock reflects off the origin and re-shocks the shell, the shell 
will exhibit significant instability growth due to the originally grid-seeded perturbations and the 
fact that gradients of pressure and density are oppositely directed (Ray leigh- Taylor unstable). 
This growth can be reduced by resolving the surface - at some cost - with finer zoning. If this 
re-shock occurs within a small number of zones of the origin (i.e., < 30 — 50 zones), then the 
shock itself will not have achieved self-similarity (roundness) after its bounce, and will hit the 
(constant radius) shell at different times at different angles. This can be ameliorated by resolving 
the section of the mesh at the origin as well. In the case that an outbound shock overtakes an 
outbound shell, there is some Richtmyer-Meshkov instability growth, but it is not as severe a 
problem since the gradients of pressure and density now tend to be aligned and this configuration 
is unlikely to be reshocked (or reshocked very strongly) by refiected shocks from larger radii. 



5 Multi-Material Equations of State and Mixed Cells 

In any Eulcrian code, regardless of the degree of adaption, the majority of cells will eventually 
become mixed with more than one material. Given that observation, RAGE has developed a 
multi-material equation of state package (MMEOS) that assumes that all cells are mixed zones. 

Given RAGE's time-explicit hydrodynamic algorithm, sound waves will cross the smallest 
zone in two to three timesteps, relaxing any subzonal pressure differentials in that time. While it 
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Figure 13: Pointwise Li norm of velocity as 
function of position at t=0.6 for LeBlanc's 
strong shock tube problem. Trapezoidal 
curve is the analytic velocity profile. Er- 
rors at the shock front {x — 8) dominate 
this error norm. 

Ll(rho) error for clx=.01, .005, .0025; t=0.2 



dx Ll(rho) Ll(xdt) 

0.01 1.92e-2 1.29e-2 

0.005 9.52e-3 6.46e-3 

0.0025 4.82e-3 3.18e-3 



X [cm] 




- l1d(dv/2) 
■ lld(dv/4) 
-rhQ(dx/4) 



l.OE+OO . 





: i 
. f 

-1 

1 


convergence rates : 


:i 
1 
1 

J 




:i 

:| 


LI error in velocity (dx,dt) 










^ LI error in velocity (dx/1 6, dt/1 6) 



0.00 



time [s] 



Figure 14: Mesh-integrated Li norm of 
velocity as function of time for LeBlanc's 
strong shock tube problem (5 lower curves, 
4 mesh doublings). Convergence ratio (up- 
per 4 curves) asymptotes to ~ 0.85. 




Figure 15: Pointwise Li error in density Figure 16: 30 density contours at t = 0.2 for 2-d 

(xlO) vs. position at t=0.2 for a Mach 10 double Mach reflection test problem, Ax ^ Ay = 

shock driven from the left into po = 1-4,7= 1/60 (top). Ax = Ay = 1/120 (bottom). Cf., [21, 

7/5, po = 1 ideal gas. Gold solid curve is Fig. 9e, bottom 2 plots], [30, Figure 9, top plot], [31, 

calculated density, showing source errors. Fig. 8, top 2 plots]. 
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is not possible to hand- wave temperature equilibrium in as cavalier a fashion, RAGE nonetheless 
makes the assumption that all mixed cells are in pressure and (material) temperature equilibrium. 
This assumption has the advantage that there is a unique solution to the question of the mixed- 
cell temperature and pressure, and has the further advantage of being consistent with the idea 
that if adaption (based on pressure gradients) has stopped in a region of zones, there is nothing 
further to resolve. 



5.0.3 Inverted Pressure-Temperature Tables 



In order to quickly invert mixed-cell equations of state, a RAGE preprocessor inverts the equa- 
tions of state for every pure material before beginning a calculation. RAGE can do this pre- 
processing for (p, T)-based SESAME tables [32] , or a large variety of analytic (either (p, T)- 
or (p, e)-based) equations of state {e.g. Mie - Griineisen[33], JWL[34]), at which point only the 
inverted {P,T) tables need be saved for use in multiple executions of RAGE problems. 

The philosophy of the multi-material equation of state routine (s) is to take these inverted 
tables - Vm{P,T) and em{P,T), where Vm = Pm~^ is the specific volume and e„i the specific 
energy of a "chunk" of material m - and construct the total volume and energy by weighting the 
individual tables by means of the known fractional masses. In addition to the fractional masses, 
Mm, and their sum, M , the volume of a zone, V, and the total energy, E, in a zone are also 
known quantities. 

In outline, the user inputs to the preprocessor a range of temperature and pressure on which 
to build each material's inverted Vm{P, T) and em{P, T) tables. This is done by making Maxwell 
constructions in the original form of the tables, if they are not already there, and searching along 
isotherms (in (p, T)-based tables) to find the desired pressure. The values of 1/p and e are then 
saved. In order for tables to be invertible, it is necessary that various thermodynamic derivatives 
be positive: 
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and these are enforced in the preprocessing step that creates the tables. 

When RAGE calls the equation of state to find a pressure, it knows the total volume V, mass 
M, internal energy Etot, and mass fractions, Xm = ^jf^, within each cell. MMEOS begins an 
iteration process by searching for two adjacent isotherms that bound the specific total internal 
energy of the cell corresponding to the lowest pressure in the table and calculating the fractional 
distance between them that would give the correct total energy, such that 

J2 a;me„,(r,o, Pio) = eio < -j^< eu = ^ .T„e„,(T,„, Pio) , 

m m 

Etot/M - eio 
s = . 

ehi eio 

Using the weighted values of various thermodynamic derivatives, MMEOS also calculates 
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so that this can be used to estimate the next pressure in the table at which to repeat the 
process. In general, one interpolates along an isotherm at the desired pressure to calculate the 
internal energy, until one finds the two bounding isotherms. Although there are various limits 
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designed to ensure that one does not shoot too far through the table too fast, this in broad 
outline describes the method of finding the point at which a mixed cell exists in pressure and 
temperature equilibrium. 

5.0.4 Speed and Accuracy 

The procedure outlined above is quite fast, since the expensive table inversion process occurred 
once during a pre-processing step. Given the inverted tables, the time to look up a mixed equation 
of state compared to a pure equation of state is a linear function of the number of materials, 
where various table properties have to be mass-weighted (such loops would be short-vector or 
superscalar loops). 

Since some other methods {e.g., pairwise balancing of pressures) scale quadratically with 
materials, they would be significantly inferior applied to some of the ASCI-scale problems (e.g., 
50 million zones) that have upwards of 50 materials in the problem. Even so, this linear algorithm, 
on small problems with few materials ( 50k zones, 5 materials) can consume half the execution 
time for a pure hydro problem. 

The accuracy of this process does depend on the models used in the original tables; if some 
SESAME tables have limited domains, then the extrapolations off those tables to construct 
the global inverted table will be problematic: densities on the lowest isobar, for example, are 
overridden so that they will extrapolate to zero at zero pressure for any temperature, while 
energies are reset to the next higher isobar's value so that ~ 0. Similarly, > 

is ensured by requiring that e'(T„_i,P) = min(0.999 • e{Tn,P), e(T„_i,P)), starting at the 
penultimate temperature and working downward along each isobar. 

5.1 MMEOS Limitations 

The MMEOS algorithm was designed in the context of solid mechanics problems, and works 
equally well in the higher-temperature (LTE) regime at Los Alamos.^ This does mean, however, 
that the algorithm is not easily adapted to the "2-T" regime of separate ion and electron temper- 
atures - one would have to construct 3-dimensional arrays, e.g. Vm{Ti,T^,Ptot)i assuming that 
it is the total pressure of each chunk of material that equilibrates. Moreover, one would need 4 
such arrays: Vm, ej^", e^'^ and g'^^'^ = pf}f'^{vm,Te) / Ptot, in order to satisfy all the energy and 
volume constraints, as well as to partition pdV work among the ions and electrons. This is a 
subject of research at present. 

6 Gravity 

Because gravity is a long range force, the large spatial scales characteristic of astrophysical 
systems make necessary its inclusion in many problems. Three different types of gravity have 
been incorporated into the RAGE code: constant acceleration, analytic, and self-gravity. They 
may be used in any combination. The simplest is a constant acceleration, which may be applied 
in any coordinate direction. The magnitude and direction of the acceleration are specified by 
the user. The analytic gravity routines accessible through user inputs are spatially varying but 
constant in time. They include single and binary star potentials. Also included is a power-law 
mass distribution to model the potential due to a star cluster. The gravity routines are written 
to allow users to add other analytic potentials with little effort. 

^By this we mean that if one of the chunks in a mixed zone is in a state "under" its vapor dome, MMEOS is 
going to return a single mean density, and not separate densities and volume fractions for the two phases. 
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6.1 Self-Gravity 

For problems involving self-gravity, the potential and acceleration are determined through a 
global consideration of masses. For a matter distribution characterized by a spatial function of 
density p{x,y,z), the specific gravitational potential (</>) is determined everywhere in space by 
solving Poisson's equation, 

V^(t)^ 4:TTGp{x,y,z) , 

where G is the gravitational constant, G = 6.67259 x 10~^cm'^g~^s^^. The acceleration is in 
turn given by the negative gradient of the potential, g — —Vcj). The equations of hydrodynamics 
are hyperbolic equations and can be solved by a local (explicit) method, but Poisson's equation 
is an elliptic equation which must be solved by a global (implicit) method. For this reason, the 
calculation of gravitational energies and accelerations typically requires as much computing time 
as all other hydrodynamic processes combined. In order to model self-gravitating systems with 
sufficient spatial resolution within reasonable run times, a fast gravity solver is desired. 

While many methods for solving Poisson's equation exist, the adaptive grid algorithm used 
by RAGE restricts the possibilities. The chosen method is the multipole method of Salmon, 
Warren, & Winckelmans [35]. The SWW method was developed for N-body particle codes, but 
the hierarchical tree structure employed is also compatible with the unstructured mesh of RAGE. 
At any moment in time, the discrete integral representation of the Poisson equation is written 
as 



Xn 



q 

This is a direct sum over q cells, each of mass m^, in order to determine the potential at a chosen 
location x. To solve equation (22) on a grid containing N cells requires 0{N'^) operations. The 
multipole method is designed to lower this number to approximately 0{N log N). 

The key to the multipole method lies in combining distant cells into groups and then solving 
for the potential due to the group rather than each individual cell. As SWW point out, this is 
analogous to treating the Earth as one extended object rather than a collection of individual 
atoms when calculating the gravitational force beyond its surface. The right hand side of equation 
(22) is approximated by a multipole expansion for a group of q cells within the region 5*: 



1 Qij{x Xcm^i(^X ^cm) 



2 \\ X Xr/m. 1 1 



In this expression, Ms is the total mass contained in S, Xcm the center of mass of S, and Qij the 
quadrupolc moment of S about its center of mass. The dipole term of the expansion vanishes 
when taken about the center of mass. The quadrupolc moment Qij, is determined by 

Qij ^ ^ [3('^g Xcm)i{Xq Xcnijj ^iji.'^q Xcm) ' (-^g '^cm)] ; (^4) 

qes 

where Sij is the Kronecker delta function, and repeated indices indicate a summation. A multi- 
pole acceptability criterion (MAC) is required in order to quantify the error introduced through 
the use of equation (23). The multipole expansion requires the use of data on all cells, but RAGE 
typically computes data on leaf cells only. (A leaf cell is one that is not sub-divided into daughter 
cells.) So before the gravity solver can begin, the mass, center of mass, and quadrupolc moment 
for all mother cells must be determined through the aggregation of data from the daughter cells. 
Once the needed data are in place, cell-centered values of </> and ~W(f> are determined for each 
leaf cell with a treewalk through the grid hierarchy. The treewalk starts with a loop over the cells 
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on the coarsest grid level. If the cell in question is a leaf cell, its contribution to the potential 
is computed from equation (22). If the cell is a mother cell, its MAC is determined. Should the 
MAC prove to be satisfied, equation (23) is used to calculate the contribution to the potential. 
If the approximation error exceeds the allowable value, the source cell is divided into its daugh- 
ters and the test is repeated. This continues until either the MAC is satisfied or only leaf cells 
remain. Note that in the case that AMR is not used, this procedure becomes the direct method 
embodied by equation (22). 

In the manner of SWW, we consider the MAC to be satisfied should the MAC error be lower 
than a user specified limit e: 



d? (1 



' '(f + 2)fii-(p+l)fi|)<e, (25) 



In this expression, p represents the highest order retained in the multipole expansion, d =|| 
X — Xcm II, and h ~ maxqgg || Xq — Xcm II- The moments _B„ are defined as 

Bn = ^\\ Xq- Xc„i ir mq . (26) 

qes 

Embodied in the MAC is the notion that the multipole approximation is more accurate if the 
distance to the measurement point is large (large d), the sources are scattered over a small region 
(small b), higher order approximations are used (large p), and the multipoles which have been 
neglected are small (small Bp+i). For the RAGE implementation, the quadrupole approximation 
is used, so p = 2. 

As expressed by equation (25), the MAC is an absolute criterion. The error tolerance must 
be set according to the smallest contribution of V0. However, applying the most stringent error 
tolerance everywhere on the computational grid will result in a higher accuracy than needed and 
a significant slow down of the code. The solution to this dilemma is to turn the MAC into a 
relative error criterion. This is accomplished by noting that the expression in equation (25) has 
the dimensions of a mass divided by distance squared. Therefore, a dimensionless relative MAC 
is defined by dividing the left hand side of equation (25) by Mg/d^ to get (for p = 2), 

^- p - 3§ ) < mac_tol . (27) 

This new relative MAC (named mac_tol in RAGE) represents the relative size of the error in 
using the quadratic approximation instead of the direct sum. Typically a value of mac_tol=.01 
proves sufficient. 

It is useful to recast the relative MAC into a form in which it becomes a property of the 
source cell [36]. The MAC can then be calculated for all mother cells prior to the treewalk. A 
critical radius Tc is defined by solving equation (27) for d. This problem is simplified by noting 
that by the definition of the moments in equation (26), i?4 is always greater than zero. Therefore, 
that term can be dropped from equation (27) without violating the inequality. The equation can 
then be re-written to produce a cubic expression for d: 

AT) 

d^ - 26d2 + 52^ _ ? > 0. (28) 

Ms mac.tol ^ ' 

When treated as an equation, this expression can be solved analytically. The result is the critical 
radius r^, defined as the closest distance to the source for which the quadrupole approximation 
meets the criterion set by mac_tol. Note that the computational simplicity gained by dropping 
the i?4 term comes at the price of a critical radius which is larger than is strictly necessary. 
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6.2 Integration of Gravity into RAGE 



Once the gravitational potential and acceleration have been determined, the results must be 
incorporated into RAGE. The conservation equations for mass, momentum, and energy take the 
modified form: 

, (29) 

-pV0 , (30) 

-pvL ■ , (31) 

where (j) is the gravitational potential, and V0 = g the gravitational acceleration. 

Adding gravity into a Godunov code is a multi-step process. During each direction-sweep 
through a hydro cycle, gravity affects the dynamics both before and after the Riemann solver. 
Before being passed to the Riemann solver, the gravitational acceleration is combined with the 
acceleration due to the pressure gradient to form half-time values of and uj^. After the 
Riemann solution is formed and the state variables are advected, the energy and momentum are 
then modified by the changes in gravitational potential and acceleration such that 

AE = AE +<j) Am , (32) 
A(mu) = A(mu) +m gAt , (33) 

for a cell of mass m with a change in mass due to advection of Am. 
6.3 Limitations 

A significant decrease in the run time of the self-gravity solver can be achieved by taking advan- 
tage of the manner in which RAGE indexes cells among different grid levels. Setting up the best 
combination of grid size, AMR levels, and gravity input parameters requires some vigilance on 
the part of the user. 

As of this writing, the RAGE self-gravity solver operates only for three-dimensional prob- 
lems with non-periodic boundary conditions. Two-dimensional solutions and periodic boundary 
conditions are under development. 

7 Radiation-Matter Energy Exchange in RAGE 

Many authors[37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 
58, 59, 60] have addressed the topic of handling the coupling of energy and radiation at high 
temperatures. Our approach has three interesting features: 1) It produces an exponential relax- 
ation of the difference between these two, which is more accurate in general. We explain this in 
detail and compare it to alternative treatments. 2) It can handle arbitrary variation of Cy and 
"Planck" opacity within a timcstep, and 3) it uses a novel technique to solve our resultant set 
of nonlinear equations. 

7.1 Nomenclature 

The primary dependent variables in RAGE are E, p, E„i, and u, where E represents the radiation 
energy density, p the mass density, pE„i the material total energy density, and u the material 



dp 

dt 
g(pu) 

dt 
djpE) 

dt 



+V • (pu) 
+V • (puu + p) 
-V • {pEu + pu) 
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velocity. The material specific internal energy is defined by 

1 9 

e = Eyn- 2" • 

The temperature 9, pressure p, and specific heat Cv are given in terms of e and p by an (inverted) 
equation of state, which may be tabular or analytic. The radiation coupling — pKats and mean 
free path A — \/ pntot are also given by opacity tables or analytic functions. 
The quantities and Tr will appear; they are given by 

$ = a6'\ (34) 

where a is the radiation constant. Tr is called the radiation temperature, but we do not mean to 
imply by this that the radiation has any particular spectrum; Tr is just a measure of radiation 
energy density. 



7.2 Energy Equations: General 

Although RAGE has options that enable more ambitious physics treatments, here we assume 
local thermodynamic equilibrium, with electrons and ions each described locally by the same 
temperature 9. 

RAGE is a multifrequency code, but this article will treat radiation in the gray diffusion 
approximation, for which only its energy density E is of concern. Its spectrum is of interest only 
insomuch as it affects the calculation of p, and A; it is not calculated by the equations described 
herein. The spectrum may be the result of a simultaneous multifrequency-gray calculation (as 
described by Winslow and others[53, 61, 62, 63]) of which these equations form the gray pass. 
Or it may be assumed to be that of a Planckian at temperature 9, or even at temperature Tr. 
The treatment described herein applies to all of these cases, with small suitable modifications. 

RAGE is not a relativistic code, but it does consider terms in u/c to first order [53, 54]. All 
the quantities involved with radiation are taken to be evaluated in the rest frame of whatever 
fluid element they reside in at the moment, the co-moving frame. To 0{u/c), there is a simple 
transformation between the radiant fields in a comoving frame and those of an inertial frame [53]. 
Finally, by making the Pi approximation (/ = i^{cE + 3ri • F) or equivalently, Pr — E/3), the 
hierarchy of moments of the transport equation close at the momentum equation, so that the 
relativistically correct equations (to 0{u/c)) become 

F u 9F 
,,iE-i)-n---2-,--. (35) 



dE 
~dt 
1 dF 



V • {[E+p,]vL + F 
F 



7.2.1 Gray Radiation Hydrodynamics 

We now make the diffusion approximation, ^ = 0, relieving us of the burden of solving the 
radiation momentum equation, (36), and dropping the last term from the radiation energy 
equation, (35), above. We further assume that velocity-dependent terms do not contribute to 
the steady-state flux so that we can maintain F oc Wpr- The resulting radiation diffusion 
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equations for the energy E and flux F are 



n 771 

+ V-F = -c/i(S-$) 

at 

-V • (Eu) 

-V • (p,u) - u • ^ , (37) 
cA 

F = ~CD\7E , (38) 
where the diffusion coefficient D is given by 



D 



Ac 

y 



and C denotes a flux hmiter, a common device [55, 56, 57, 58, 64] invoked to Umit the diffusion 
flux so that it does not exceed the streaming Hmit, cE. In RAGE this limiter, if active, depends 
on the dimensionless quantity X\V{E)\/E, among other things. RAGE uses the Levermore- 
Pomraning flux limiter [64]; others have been tried at various times (by us and others [57]) and 
not found to make much difference. 

The material momentum equation, to 0{u/c), is 

£(pu) + V-[puu + p] = ^ , (39) 

and the material energy equation, to the same order, is 

1 



pe+ -pu^ +p ) u 



V • = cix{E-<^) + ^n-¥ , (40) 

Ac 



with the material heat flux F^, given by 

F™ = -xV0 . 

The material conductivity, x, can be read from SESAME tables or calculated from simple analytic 
expressions. 

7.2.2 Split Equations 

At each timestep RAGE splits the time integration into three separately performed phases: 
hydro, material heat conduction, and radiation. 

The various relativistic terms are treated differently in RAGE. The advection term, line 2 of 
Eq. (37), is calculated explicitly in the hydro phase and the updated result becomes the initial 
value E_ , for the radiation phase. The energy and momentum changes wrought by the radiation 
pressure and the flux, line 3, are explicitly treated at the beginning of the radiation phase. Line 
1 is of course solved implicitly as the final step of the radiation phase. Since u and p are regarded 
as time-independent functions of position during the radiation phase, the matter equation during 
this part of the operator split simplifies, leading to the following coupled equations, 

-y ■ C{E)D{t;E)\JE{t) ^ ~cp{t){E{t) - ^{t)) , (41) 



dt 
dejt) 



-cpit)iEit)-m), (42) 
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where material properties (D, fj,, Cy, etc.) depend on t because of their dependence on 9, which 
itself depends on the time dependent internal energy e. Of course these properties depend on p 
too; we shall not bother to indicate this. 

Adding Eqs. (41) and (42) gives the basic diffusion equation to be solved in this phase: 

^-^^±^~V-CDVE^O. (43) 

7.2.3 Special Case: Equilibrium Diffusion 

If the coupling coefficient /i is very large, in a sense which we will make precise below, Eq. (42) 
becomes stiff and implies 

E{t) - $(t) = . (44) 
From the definition of <f>, Eq. (34), we have 

9{t) = VmJ^ , (45) 
and the equation of state, £, provides the material specific energy e, and specific heat Cy: 

eit) ^£ie{t)), 
'^v{0) = w\p ■ 
We see that Eq. (43) requires d{pe)/dt, which we can evaluate as 

d{pe) 

g-f. ~ de d$ dt ' 

where the last line follows from the result of the large p, Eq. (44) (remember that p is fixed in 
this phase). Inserting this into Eq. (43) produces the equation of equilibrium diffusion 

(1 + pCv /4a6'3) - V • CDVE ^ . (47) 

(This is often expressed as an equation for 9: (pCy + 4a6''^)^ — VAaCD9^V9, and may be 
referred to as 1-T radiation conduction.) In general, (47) is an implicit and nonlinear equation. 
(The exception is the unphysical case in which D is constant and Cy is proportional to 9^^ 
introduced by Pomraning [65] specifically for the purpose of generating a simple test problem 
and so used by Su and Olson [59, 66], and others [58, 60, 67].) It is implicit because the material 
properties Cy and D are related to 9 through the equation of state and opacity function, and the 
flux limiter C (if active) is explicitly nonlinear. It is nonlinear in addition because 9^ appears, 

and is a function of E, not just a parameter: 9 = \p^- 

7.2.4 Special Case: Linear Diffusion 

If p is very small, the matter and radiation decouple, Eq. (41) becomes 

dE{t) 



dt 



V • CDVE{t) = , (48) 



and the material temperature never changes. Unless the flux limiter or other phenomenon {e.g., 
two-temperature opacities, see appendix B) causes CD to vary with E, this is a linear equation 
and can be solved by standard methods. 
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7.3 Time Differencing 



We now return to the case of non-equilibrium diffusion. One of the unique features of RAGE's 
treatment is the asymmetric temporal treatment of the radiation energy equation and the mat- 
ter energy equation. As we will see, the radiation energy (diffusion) equation is handled with a 
typical backward difference approach, but the matter equation is rearranged, treated as an ordi- 
nary differential equation (ODE), and integrated exactly, leading to what we term exponential 
differencing. 



7.3.1 Solving for E Using Backward Differencing 

At every timestep RAGE integrates the diffusion equation from a starting time i_ to a finishing 
time an interval At. ft assumes that E{t) throughout the timestep can be approximated by 
its final value, E^. This is the backward difference approximation, only first-order accurate in 
time but stable and very robust. Thus integration of Eq. (43) over a timestep results in 

{E+ + pe+) - (S_ + pe_) - V • CD\7E+ At = , (49) 

where CD is 



f 



t+ 



CD= — CDdt , 
Jt- 

and the evaluation of CD is left ambiguous at this point. This is not a closed system, because we 
do not yet have a value for e^, but with the E variation now fixed the matter energy equation, 
(42), becomes 

p^ = c^,{t){E+^m) , 



or equivalently. 



pCv^^cpit){E+-^t)) . (50) 



Integration of Eq. (50) will enable us to calculate e+ needed in (49), effectively allowing us to 
solve those two equations simultaneously. 



7.3.2 Solving for e+ Using Backward Differencing 

A very popular approach to finding e+ is to use a backward difference scheme for the matter 
equation also. This approach is used by Freeman [68], Knoll et al. [69], Shestakov et al. [58], 
and many others. It leads to a replacement of (50) by 

pCviO+ - 6i_) = cpiE+ - $+) At . (51) 

Eqs. (49) and (51) form a closed system once CD and JI are specified. It is implicit and non-lineax 
for all the same reasons as the equilibrium diffusion system, Eq. (47), and in addition involves 
the implicit (via 9 and the opacity function) and (likely) nonlinear variation of p. 



7.3.3 Solving for e+ Using Exponential Differencing 

We choose not to use the backward difference scheme on the material energy equation, (50), be- 
cause it can result in predicting noticeable temperature differences between matter and radiation 
in cases where the coupling is large enough that these should be in, or close to, equilibrium, as 
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we shall see below. We are able to avoid this problem by treating Eq. (50) as an ODE - possible 
because the coordinate x is there only as a parameter; no spatial derivatives appear. This will 
let us find the actual time dependence of $ (and 9 and the variables which depend on it) during 
the timestep interval At, allowing a more accurate evaluation of the energy transfered during 
the timestep. 

If we define a variable with the dimension of time, 

(52) 



then Eq. (50) implies 

d<i>(t) _ E+- $(t) 
dt ~ r($(t)) 



(53) 



From this we see that r is the scale time for radiation-matter coupling (strictly speaking, t is the 
relaxation time for $ and E coupling in a gradient-free medium) and it is the proper measure 
by which we can quantify our earlier statements about /i being large, leading to Eq. (44) or fj. 
being small, leading to Eq. (48) {i.e., t small compared to the time of interest or vice versa). 

Viewing t as the dependent variable and as a simple fixed parameter we can rearrange 
Eq. (53): 

dt = ri^)^. (54) 

This implies that as the interval time increases monotonically from zero toward infinity, the 
difference between and <!> relaxes from its initial value, {E+ — <&_), towards 0. r, which 
is positive and finite on physical grounds, affects only the rate at which the equilibrium is 
approached. Equation (54) has the formal solution, 

At^t,-t_^l^_ rm^^, (55) 

and inversion of this equation gives <i>_|_ as a function of E^ . This in turn gives the temperature 
0+ at the end of the timestep and, via the equation of state, e+. Thus we have in principle: 

e+ = eiE+) . (56) 

7.4 Inverting At(<l>+) 

The method RAGE uses to invert Eq. (55) is the subject of the next few sections. We begin 
with an illuminating special case. 



7.4.1 An Ideal Case 

It is very common that /i (for ionized materials) varies approximately as the inverse cube of the 
temperature 9 while Cy varies slowly by comparison. Figure 17 shows, at left, a log-log plot of 
the Planck absorption opacity ^ of aluminum at one tenth normal density, p = 0.27 g/cm^, as 
calculated by the TOPS opacity code at LANL [70]; at right is a plot of the specific heat Cy 
of the same material, calculated using a Saha-based equation of state [71]. These plots suggest, 
at least in this case, a) that a 1/9^ approximation is roughly correct and b) Cy is constant by 
comparison. 
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Planck Opacity of Al, density 0.27 CV of Al, density 0.27 




9 (eV) 9 (eV) 

Figure 17: Left: Planck opacity of Al at p = 0.27 g/cm'^. The heavy lines illustrate 9^^ behavior. 
Right: Specific heat, Cv, for Al at p = 0.27 g/cm"^. 
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Let us idealize this behavior, for the sake of an example, to the case where /x varies as and 
Cv is constant. Then r is also constant, and the integration and inversion of Eq. (54) becomes 
trivial: 

^+-E+^ e-^'/^ ($_ - E+) , (57) 

explicitly displaying the above-mentioned relaxation toward E^. (The same favorable circum- 
stance of constant r occurs for gray Pomeranium, a (mythical) element with a material specific 
heat Cv oc 6''^, but with a constant opacity [65].) 

The fully backward differenced matter equation, (51), with the same assumptions, would 
replace Eq. (53) with 

di> _ - E+ 
lit ~ T ' 

giving the solution. 

If we recall that the equilibrium diffusion model, Eq. (47), gives 

^+-E+ = , (59) 

we see that all three schemes' solution can be written as 

$+ = + (1 - a)E+ , (60) 

with a — 0, a — Y+St/r^ or a — e^'^*/'^, for equilibrium diffusion, fully backward differencing, 
or exponential differencing, respectively. 

To repeat, the decay toward equilibrium in our scheme proceeds as exp (— Ai/r) whereas the 
backward scheme goes like 1/(1 -I- At/r). This means that, although both agree to first order 
in At/r, the exponential scheme always transfers more energy between matter and radiation. 
Although it does not insist on equilibrium, it gets very close as At/r gets large. We expect our 
method to do better for intermediate timesteps where At ^ t. Figure 18 shows these differing 
solutions over a range of a few time constants; Appendix A discusses the application of this 
technique to the case when the opacity has a power-law behavior other than /i ^ 0~^, and we 
discuss arbitrary opacity behavior in the next section. 

7.4.2 (f> in the General Case 

As can be seen from our example opacity plot. Fig. 17 (left), the inverse cube behavior of jj, 
is common but not a law, nor, as seen on the right, is Cy always a constant. For aluminum 
in the temperature range 10-30 eV, both actually increase with 6, and r turns out to vary 
approximately as 9~^^^. 

rage's strategy for handling such cases consists of a more careful treatment of Eq. (55) . It is 
invoked if we find that the constant-r approximation fails, as given by this criterion: evaluate r at 
^>_, use it with E- in Eq. (57) to find <&+ and hence 9+ (and thence Cy and ^) and reevaluate 
T. If the two values of t are substantially different, then the timestep At is broken into M 
subdivisions At*^™^ small enough that r is reasonably constant over each one. Equation (55) is 
evaluated piece by piece with t time lagged in the integral on each subinterval, 

At(™) = /'^+-*'"' d{E+~^') 
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Figure 18: Three models for a 



so that we can do the integration and inversion for that subinterval: 



(62) 
(63) 

Expanding <I)(™~i) in terms of c[)(™^2), efc, then gives 

= _^ (^Qf(")(l - a(™-i)) + (1 - a^™)) , 

= (l-a(i)---a("))i;+ , (65) 

which lets us find the new time constant t*^™); 

r(") =t($('")) , (66) 

and we repeat until At is reached. We emphasize that At^^™^ is chosen first, then <I)(™) is found 
from it. Defining an unscripted a, 



a = • . «(3) 



and using Eq. (65) we have, as before, 

<i>+ = a<i>_ + (1 - a)E+ 



(67) 
(68) 
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giving the desired <I>_|_ in terms of known quantities, even though a is no longer a simple expo- 
nential. 

This subcycling need be done only for those cells that are undergoing large material property 
changes in a given timestep, for instance when the opacity does not approximate the common 
inverse cube law or the specific heat increases suddenly as the result of beginning to burn out 
an ionization stage. In practice this is a rare event. The overall cost in time and load balance 
is small anyway compared to the cost of a diffusion solve, at least in 2-D or 3-D, since inversion 
of the diffusion operator is not involved in this procedure. We also reap the benefit of treating 
non-ideal cells accurately without slaving the timestep of the entire mesh to the small values 
required by a few stiff zones. For example, a non-subcycled backward difference code might 
require a timestep based on a relative temperature change of 5-10%, while we can allow an order 
of magnitude change without adverse effect (if this is the only stiff effect). 

Notice that we have separated the problem of handling material properties that have any, 
possibly tricky, variation with temperature from the problem of dealing with the so-called ratio 
of specific heats. This latter problem involves only the treatment of the behavior of 0^ , simpler 
in principle than dealing with the vagaries of temperature variation of opacity or specific heat, 
and is with us even in the case of equilibrium diffusion, Eq. (47). We treat it in section 7.7.1. 

Note that a is always between and 1. For the simple case of constant r this is obvious 
because a is an exponential of a negative real (a = e~^*^^). For the more complicated case 
of general r, a is a product of M factors, each of which is bounded by and 1. We can test 
the accuracy of the approximation that r is a constant (over the subdivision) by comparing 
successive values of r*-™-*, and adapting the subcycle timesteps appropriately (e.g., given the 
logarithms of the ratios of the two t's and the two 0's, we can parameterize t as a power law in 
temperature (r ^ 9^). Requiring ^</^pf</^|f = fc£J(i„eAt('")/r<"-)) < j 

Using Newton-Raphson, one can solve for At(™). ) 
7.4.3 A Practical Illustration of the Iterated <I> 

A valuable side effect of this approach is that it has the capability to handle cases in which the 
opacity has a local maximum, and this approach can actually provide the correct solution even if 
some zones start the timestep below the opacity peak peak and end above the peak: one merely 
marches through the difficult region using Eqs. (61)-(66). An example is given by the gray 
opacity of oxygen at p = 1.3 • 10"** g/civ? , as it responds to an 80 eV radiation front. Figure 19 
shows that the opacity increases by almost 10^ between 1 and 10 eV before it drops back over 
the next decade of temperature (SESAME's EOS 5010 has Cy constant over this interval). 

Without subcycling, one has to hope that t is constant for $ to be correct, and when /i is 
not proportional to , this means limiting relative changes in r to a small value, say 10%. 
However, as Fig. 19 shows, for temperatures between 1 and 10 eV, ^ ~ 6'+'^, so r ^ 9~^, and a 
10% change in r requires no more than a 1.66% relative change in temperature. Figure 20 shows 
a non-equilibrium Marshak wave in relatively low-density oxygen (idealized from the gas-filled 
hohlraum that raised this issue). The oxygen sits in a (relatively) high temperature radiation 
bath until enough ionization occurs to allow it to rapidly heat up. The solid black {9{x)) and 
solid red (Tii(x)) curves shows a converged calculation, run with 2000 timesteps to a final time 
of 20 picoseconds. The various dashed curves show profiles of Tb{x) and 9{x) at i = 20 ps, 
when RAGE was altered to not do any opacity subcycling but to run on a timestep control that 
kept the relative temperature change below 20, 10, 5 and 2% (requiring 122, 254, 554 and 1493 
timesteps, respectively). Only the last calculation has converged the matter temperature. 

When RAGE has adapted an interface in order to get the correct energy deposition into a 
wall, say between air and gold in a hohlraum, it has to adapt in the other direction(s) as well. 
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Figure 19: Absorption coefficient /iaf)s(^), for oxygen (/?= 1.310 g/cm'^). 
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Figure 20: Material and radiation temperature figure 21: Material and radiation temper- 
profiles in oxygen at 20 ps. No subcycling of ^^^^^^ profiles m oxygen at 20 ps. The 
opacity occurs. Dashed curves used timestep ^oli^^ ^lack curves subcycled opacity, the 
controlled by Ae/9 = 20, 10, 5, 2%. The solid long-dashed curves didn't; both used At = 
curves used 2000 timesteps (At = 10"" s) to ^ • 10 s. The heavy-dotted curves used 
converge. " ^■ 



For the case depicted by Fig. 20, RAGE has very small zones in the direction that the Marshak 
wave is propagating, even though it does not need them in that direction, and would prefer to 
spend as little time as possible calculating the flow in that direction. Contextually, "little time" 
means flowing radiation on a radiation Courant timestep. At « Ax/c, taking 1 or 2 timesteps to 
cross each zone. Since the level 1 mesh size is 0.01 cm in this example, this means At « 3 • 10"^"^ 
s, requiring 67 timesteps to get to the displayed time of 20 ps. If one used traditional timestep 
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controls, allowing 2% relative temperature changes from 1 eV (when opacity begins to change) to 
40 eV, one would require 186 timesteps to cross each zone. That becomes prohibitive given the 
size of 2D and 3D AMR matrices and the time required to invert them. Figure 21 compares the 
results with and without the opacity subcycling when the timestep is forced to be the radiative 
Courant timestep, and both are compared to the 2000 timestep converged result. The width of 
the leading edge of the radiation temperature reflects the 0{At) accuracy of the (flux- limited) 
diffusion solution; it can only be improved with smaller timesteps, and even cAt/Ax = 1/2 will 
be much better. If RAGE had not been able to subcycle and thereby dispense with traditional 
timestep controls, it would not have been able to run this type of problem. 



7.4.4 d$+/d£'+ 

Recalling that the equation we need to solve involves d{pe)/dt, and using the chain rule, we can 
now write 



dt 



_ ae de aE _ pCy d'S> be 

~ de dE dt ~ 4afl3 dE dt 



As we will see presently, our solution scheme will require the time-advanced d^-^-/dE+. For 1/6^ 
opacities and constant speciflc heat, r is constant, a = e^'^*/'^, and d^+/dE+ results trivially 
from Eq. (60): 

(69) 



dEl 



In the general case we still have Eq. (68), but since a depends upon E^, we must compute the 
derivative in a more roundabout way. By taking the derivative with respect to Ej^ of Eq. (55) 
and doing a little manipulation, we have 



dE, T, 



Again we break the integral into smaller pieces. The added cost is small because we can 
recycle the partial results from Eqs. (61)-(66), building 



r(0)(l_«(l)) .^(2) .^(3)...^(M) 



dE+ ^+ ' 

+^(i)(l_a(2)).a(3)...a(M) 
+T(2)(l_a(3)).a(4)...aW 

H 

+^(M-l)(i_„(Af))] ^ 
,M-1 



and calculate this at the same time we calculate a. By working out the telescoping sums of the 
weights, we see that = 1 — a, and rescaling to create normalized weights w™ that sum 

to unity, 

EE - a) 

= (1 - «('"+!)) • • • • • a(^'^' /(I - a) , 
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we can define an average (t) 



enabling us to rewrite Eq. (71) as 



A/-1 







with a given by Eq. (67). This means that in the general case we have the result for a constant 
T, Eq. (69), multiplied by the ratio of the average of r over the timestep to its value at the end 
of the timestep. In all but the most perverse cases this ratio will be of order unity (RAGE does 
not depend on this supposition, using Eq. (71) directly). 



7.5 The Equations for £'+ and e+. 

Equation (68) tells us that can always be written as + (1 — a)E^, regardless of the 
behavior of fj, and Cy. What about e+, which is required to solve for the radiation energy at 
the new time, as seen in Eq. (49)? 

We use exactly the same logic as in the case of equihbrium diffusion, described above in 
section (7.2.3). The equation of state £{d) gives 

e+^£(^^/¥^) , (72) 

where <i>+ has been used to find as per Eq. (34). Hence we have e+ as a function of 



7.5.1 The Equations to be Solved. 

Finally, we specify CD as evaluated at the forward time. Our equations then become the non- 
linear coupled set, for which we invert a matrix equation for E^, 

E+ + pe+-\/ ■ {CD)+\/E+ At = E^ + pe_ , (73) 

and then solve for the remainder, 

= + (1 - a)E+ , (74) 

e+ = V^^, (75) 

e+ - £{0+) , (76) 

where Eq. (73) represents the diffusion equation (49) and Eqs. (74) . . . (76) result from the 
exponential treatment of the radiation-matter heat exchange, Eq. (50). This set is to be solved 
at every timestep for every space point, thus advancing E and e from time <_ to time Their 
subordinate variables can then also be advanced. 



7.6 Spatial Differencing 

All cells in RAGE have rectilinear cross sections with unit aspect ratio in every direction, and 
all faces are orthogonal to the appropriate coordinate axes. Therefore only one component 
of a gradient will contribute to a divergence on any face in any RAGE geometry, and spatial 
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differencing becomes quite simple. For example, if an interface normal h points in the x direction, 
then for any scalar s: 

Vs • n ^ — , 
ox 

and this 0{Ax) expression for the face gradient suffices to build 0{Ax^) zone-centered diver- 
gences. 



7.6.1 Touching Cells 

There are two possible configurations of the geometry of touching cells: either both cells are the 
same size or else one cell is half the size of the other (a situation we refer to as T-cells, given 
the form of the interface ([ h ], as in Fig. 22). In ID, adaption does not affect the size of the 
interface, but in 2D or 3D the larger cell communicates through two or four interfaces with the 
two or four smaller cells. 



7.6.2 Differencing for Same Size Interface Cells 

Same size interface cells in RAGE have the same size in the dimension(s) perpendicular to the 
line joining cell centers. In this sense all ID cells qualify, as do all other geometries except across 
interfaces where cell adaption has occurred. 

rage's difference stencil is simple. For every interface not subject to a boundary condition, 
we require that the flux calculated between the cell centroid and the interface on one side be 
equal to the flux between the interface and cell centroid on the other side, and that they be 
equal to a net flux between the two centroids: Fl = F^j = F, where 



As 



3 Atl 



Fb- ~Dn^^ ="3 Am 



^/aceA^^^+As^H - 3 ^^^^ ' ^"1 



Aszf and Asfz are the distances between the appropriate cell centroid (Z = i or R) and 
the interface / (half the zone size along Cartesian coordinates), and the t are optical depths 
{D = Ac/3 ^ At = As/A). Solving these equations gives the face quantities: 

AtlEr + AtrEl 
^f-^^ = At^ + Atr ' ^^^> 

_ c AsLf + AsRf 
^^'-'^ ~ 3 Ari + Arfl ' ^^^^ 

implying that optical depths add: Atj^r — Ar^ -I- Atr. 



7.6.3 Temperatures on a Face 

Experience has shown that if the two diffusion coefficients (-Dl, F)r) are evaluated at the zone- 
center temperatures (Ql, Or), it is next to impossible to use Eq. (77) to calculate properly the 
self-similar solution of a Marshak [72] wave. The cold material has a very small Dz (very large 
Arzf), hence Atrl is also large and thus very little radiation flows - according to the difference 
equations. The solution to this problem has been to find a common face temperature, use it 
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to define DL{9face) and Dji{9face), and then compute the Ar^, At^ and Ar^fi from those 
D's. Even though the temperature(s) are now the same, these D's will difi^er if the densities or 
constituents of the two cells difi'er. 

Flux equality requires that E vary linearly in optical depth, Eq. (78), and we note that if 
radiation and matter were in thermal equilibrium (E = $) then 6'^ would vary similarly. We 
adopt this ansatz in all cases, using a two step approximation: First we calculate D's using 
the current cell-center temperatures. This lets us define optical depths and so find 0^^^^^ by 
interpolation: 



We then use this value of Oface for the "real" lookup of opacities, D's, and Ar's. Some alternate 
ways of estimating 6 face are discussed in Appendix B; we have not found them to make enough 
of a difference to offer users a choice in the matter. 



7.6.4 Differencing for Unequal Size Interface Cells 

In the case where adaption has occurred, as depicted in Fig. 22, we "splat" (extend central values 
across the bigger cells) and continue to use Eq. (77), for F03 and Fqi. This leads to a formal loss 
of spatial accuracy, but is mitigated by the fact that in our AMR implementation the gradients 
are relatively small at any place in the mesh where such scale size changes occur. Splatting has 
the advantage that the resulting matrix couples only cells which share an interface. 




Figure 22: RAGE Difference Star for Non-conforming Cells. 



Edwards [73] introduced a fix for this loss of accuracy, in the context of petroleum extraction, 
that he called the "Flux Continuous Approximation" (FCA) . Lipkinov et al. and Olson et al. [74, 
75], re-derived it in the context of support-operators (such that div and grad are consistent) for 
diffusion. The argument is summarized briefiy in Appendix C. 

FCA was coded for RAGE in 1996 for 2D geometries. Unfortunately in 3D FCA requires 
a coupling between cells that touch only at an edge. This was not allowed by RAGE's data 
structure at that time. In order that the 3D code reproduce 2D results where expected, RAGE 
chose not to use this technique in 2D either. 

This is not such a blemish on the code as might be expected from the resulting formal loss of 
accuracy. For one thing, RAGE's AMR logic places such scale changes only in regions with small 
gradients (gradients are the trigger for adaption, after all). In the case of isotropic coefficients and 
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a 2:1 split in cell sizes - both always the case in RAGE - simple splatting, observed Edwards [73] , 
"can produce extremely good results" . 

In any event, recent changes to the data structure should now allow such couplings, and we 
expect the FCA treatment to appear in the next major revision of RAGE. 

7.6.5 Boundary Conditions 

RAGE users have the option of four different types of boundary conditions for radiation: Neu- 
man, Dirichlet, and two forms of Robin. 

The default condition is to assume a Neuman (mirror) condition, VE — 0, at all edges of the 
mesh. (This is reasonable for heat transfer mediated by ions and electrons that one does not 
expect to leave the mesh, but less so for radiation, since one generally does expect photons to 
leave the problem if they get to the boundary.) 

All other boundary conditions are specified by nonzero values of temperatures on the appro- 
priate face. The user can choose to impose either a Dirichlet or a Robin boundary condition 
on all boundaries with non-zero temperatures {e.g., in 3D, one could have all six boundaries at 
six different temperatures, as long as they all use the same condition). The Dirichlet condition 
holds the temperature to a fixed value, Tf at face /; of the two types of Robin condition, the 
Marshak/Milne holds it at that temperature but 2/3 of an optical depth beyond the face, and the 
Spillman variant [76] of Marshak/Milne holds it to that temperature at a variable optical depth 
beyond the face. The Marshak/Milne boundary condition is designed to reproduce the effect for 
a single energy equation that the transport equation would accomplish with two equations; it 
requires optically thin boundary zones to succeed. Spillman's variant is conceptually similar to, 
but considerably simpler to implement than, the Levermore-Pomraning boundary condition [64] 
that enabled their flux-limited diffusion to recover transport solutions in thin and thick limits. 
We compare and contrast the Robin conditions in Appendix D. 

7.7 Solution Techniques for Non-Linear Systems 

As mentioned above, Eqs. (73). . . (76) are very similar to the result of a backward difference 
scheme, so it is of interest to examine techniques for solving that scheme. Shestakov et al. [58] 
discuss the use of pseudo-transient continuation to stabilize a Picard-Newton series of lineariza- 
tions and iterations, and Knoll et al. [69] discuss wrapping a Newton-Krylov iteration around an 
inner iteration. We might well profit from the use of such methods, but at present RAGE uses 
a different (nameless) method, also involving linearization and iteration. 

7.7.1 RAGE'S Technique 

At each timestep we execute an iteration process, with iterates {-E"^', e^|f''} (n = 1, 2, 3, • • • ) for 
the time-advanced solution of Eqs. (73). . . (76). Each iteration consists of three conceptually 
distinct steps. 

The first step consists of making an initial estimate e^^^j} of the n*'^ iterate based 

on the previous iterate {E^ ^\ ^'}. Currently we simply take 

where in particular the 0*'' iterate consists of the initial values at the beginning of the timestep: 

{E^:\ef} = {E^,e^}. 
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We identify this as a separate step in the iteration process because more elaborate methods 
for this step are currently under study, having the flavor of the prediction step in a predictor- 
corrector technique. 

The second step consists of solving a linearization of the radiation diffusion equation, in 
which the coefficients are evaluated in terms of the estimated values |-E^gs(, e^gsj| from the 
first step. More precisely, Eq. (73), 

is linearized as 

<L + PeSL„ - V . {CD)^:l,S7Ei%^At ^ E^ + pe., (81) 

where the coefficient (£_D)^'gjj is evaluated at the values {£^+^Lt7 ^+^Li} ^ith 9'^['^g^ given in 
terms of e^il'g^j by the equation of state e^g^j = £ (^^^Lt) ' where e^']-,^ is given by 



J") 



The RHS of Eq. (82) is in fact the first two terms of a Taylor series expansion of e+ as a 

function of Ej^ about the linearization point E^^\^f , with e+ and = slijr^§^SE7 given 
by Eqs. (74-76), reproduced here: 

= + (1 - a)E^^^ , (83) 



= \/<^^f/a , (84) 



(O^f) , (85) 



where a and d^/dE are evaluated as described in sections 7.4.2 and 7.4.4. Having solved Eq. (81) 
for E^l-^, we compute e^)-„ by substituting E^^]-^ into Eq. (82). We note that the solution to 

the second step, {i?^;j„, e^]-„}, conserves energy exactly, subject only to the accuracy of the 
equation of state inversion and of the hnear solver used to solve the system Eq. (81). 

Early versions of RAGE used a conjugate gradient linear solver with a point Jacobi (diag- 
onal scaling) preconditioner. We have now converted to an algebraic multigrid preconditioner 
(LAMG) developed at LANL by Joubert [77] ; this preconditioner has typically given, for moder- 
ate to large 2D and 3D meshes, a factor of five or better in speed-up of the linear solver, relative 
to the point Jacobi preconditioner. 

Owing to the linearization in the second step, the values {E^]^^, e^]j„} will not, in general, 
simultaneously solve the material energy equation, Eq. (85). Hence we do a third (corrective) 
step to bring the solution into agreement with the material energy equation for each cell, without, 
however, changing the total energy in each cell from the value obtained in the second step. This 
step consists of solving, for each cell, the energy balance equation 



for {-B^ % eV^} in which the RHS is the total cell energy from the second step, and on the 



LHS is given self-consistently as a function of E^^ by Eqs. (83-85). 
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This completes the specification of a single iteration, giving the n*'' iterate {E^''' ,e^"'} in 

terms of the (n — 1)*'' iterate {E^ ^\ ^■'}. In order to decide when to terminate the iterative 
process, we (currently) look at the total energy in the cell, exiting when changes in that quantity 
becomes sufficiently small. We are also investigating the more usual method of looking at the 
nonlinear residual. Normally, one or two iterations suffice. 

Our experience has been that, for small to moderate timesteps, the dominant nonlinearities 
come from the radiative heating and cooling associated with the material energy equation. For 
large timesteps, the nonlinearity associated with the diffusion coefficient and flux limiter can 
become more important. In fact, for large enough timesteps At, we may encounter diffusion 
fronts that move faster than Ax/ At, where Ax is the size of a computational cell; in this case it 
is necessary to execute several iterations in order for the front to traverse the required number 
of cells. 

The great bulk of the cpu time (at least for 2D and 3D meshes) is normally spent in executing 
the linear solve in the second step of the iteration (and possibly in the first step in future versions 
of the code). No other computations require inter-cell communication; they execute relatively 
quickly, particularly in a parallel environment. 

The current version of the code makes the simplifying approximation that the specific heat 
Cy is held fixed during the radiation diffusion computations. This simplification eliminates 
a number of equation of state calls, replacing Eq. (85) with the simple linear relation e_|_ = 
e_ + Cv (^+ — 0-)- Future versions of the code will include as an option the elimination of this 
simplification. 

7.8 Comparisons to Other Codes and Standard Problems 

Figure 23 shows a comparison of the results of a test problem taken from Shestakov et al. 
[58]. The RAGE results for <i> {aO'^) at 20 ns are shown as histograms and the original calculation 
using pseudo-transient continuation (^tc) as a curve through cell centers. As one can see, the 
temperatures (or <I>'s) calculated by the two codes (with the same zoning and timesteps) are in 
very close agreement. For this problem RAGE was modified to use the same hydrogenic Saha 
equation of state used by Shestakov et al. rather than SESAME tables, in part because the 
SESAME EOS includes the effect of dissociation of molecular H2, not just ionization of atomic 
hydrogen (as specified by the test problem). RAGE was also modified to use the n—1 Larsen 
flux limiter [57] (RAGE's Levermore limiter agreed with the n ~ 2 Larsen limiter, and both 
disagreed with the n = \ limiter by a little more than the thickness of the lines in Fig. 23) . 

Figure 24 shows the temporal behavior of a related problem. The configuration is changed 
to an infinite medium with fixed Cv but the same jjLa oc T^^/'^ opacity, so we can focus on the 
coupling of matter and radiation only (i.e. without diffusion complications). The plot shows 
the electron temperature as a function of time, up to 20 ns. The problem has been run three 
times, varying the integration timestep At. The smoothest curve was generated by using 200 
timesteps, each of size 10"^'^ s. Another curve used only 20 timesteps of 10~^ s, and finally 
the solid curve used only two(!) timesteps of 10~* s. Clearly one loses information as to what 
happens in the intervals between such monster timesteps, but we call attention to the fact that 
rage's exponential difference scheme gets the right answer at a given time regardless of the 
timestep - as long as opacity subcycling is allowed. 

Figure 25 shows temperature profiles for a self-similar problem due to Reinicke and Meyer- 
ter-Vehn [78] and popularized by Shestakov [79, his figure 4, "E0=235"]. A thermal wave 
and a hydrodynamic shock are created whose radii maintain a fixed ratio, generated by a point 
explosion in cold material with p ^ ^-i9/9 xhe solid black curve of the figure shows the resultant 
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Figure 23: Comparison of aT'^ spatial profiles at 20 ns from RAGE (thin black histogram) and 
\l/tc [58] (smooth blue curve). 



temperature profile using the heat conduction package in RAGE (with the appropriate analytic 
conductivity, k = 31.62 • p~^T^-^); the red dot-dashed curve shows the radiation temperature 
Trad and the blue dashed curve the material temperature Tmat calculated by RAGE's radiation 
diffusion package, all at a time of 5.15 • 10"^" s. T^at and Trad do not overlay the conduction 
solver's temperature profile because the self-similar problem was defined with a constant Cy [78]; 
unlike the conduction package, the radiation diffusion package cannot help including an extra 
AaO^ in the total specific heat, a significant component at these temperatures that destroys the 
self-similarity of the solution. The more interesting point is that RAGE's exponential differencing 
has kept the two temperatures in equilibrium (at an early time of 1 • 10^^" s, there is a slight 
disequilibrium near the origin); we expect standard backward differencing methods would have 
trouble keeping the temperatures clamped together (recall Fig. 18). We did, however, assume 
the freedom to define a Planck- weighted material absorption opacity Kats = 5.433 • 10^^ pT~^-^, 
31.32 times larger than the presumed Rosseland total opacity Ktot = 1.735 • lO^-'^pT"^-^, making 
relaxation times r ~ 10~^^ — 10^^^ s. Without this, the two temperatures were out of equilibrium 
everywhere, and the thermal front was another half millimeter retarded (r ~ 10^^^ — 10^^° s). 

Finally, we present an example of the ability of RAGE to deliver reasonable results even 
under unreasonable conditions. In particular, we display solutions of a Marshak-like problem, 
with a left boundary held at 1000 eV. Cell 1 is initialized to that same temperature, but the 
rest of the (ID) grid is started at room temperature. We use a dummy material with a constant 
specific heat, Cy — 10^^ erg/g/eV, constant density, p = I g/cm'^, and variable opacity, 



Each cell is 0.02 cm wide. We run to 10 s, at which time a simple constant-flux-approximation 
argument suggests that a wave would have burned through about 6^ cells. RAGE normally 
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Figure 24: 9(t) temperature relaxation in an infinite medium's 27.2 eV radiation bath. 
Cv is constant, /ia ~ T~^/^, and r is not constant. 

Reinicke-AAeyer-ter-Vehn problem #2, t = 5.15e-10 s 



T[eV] 




distance [cm] 



Figure 25: Temperature profiles when thermal wave has reached r — 0.9 {t — 5.15 10^^° 
s). Black solid: heat conduction only; blue dashed: material temperature, red dot- 
dashed: radiation temperature from (2-T) radiation diffusion. 
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chooses its timesteps fairly carefully, beginning with a very small one and inching up as seems 
appropriate, but for this series we forced it to use a constant timestep. We ran the problem 
three times, taking one, two, and fifty timesteps « 15,7, 1/3) to get to the final time; the 
results are shown in Fig. 26 (radiation temperature vs. space) and Fig. 27 (electron temperature 
vs. space). 

We do not claim that the results obtained with the ridiculously large timestep are accurate 
(after all, the backward difference scheme used for the diffusion equation is only first-order 
accurate), rather that they are not unreasonable; in particular, since we have no theorem that 
proves our iteration method will converge, these results are encouraging in that they do not 
diverge. Moreover, they show that when multiple (expensive) physics packages are turned on 
{e.g. the gravitational Poisson solver), we can iterate the matrix-solves in the radiation package 
without requiring the rest of the code to run on the smaller effective timestep. 

7.9 Radiation Summary 

We have described just a subset of RAGE's radiation capability: gray diffusion. RAGE also 
contains a multifrequency diffusion package, in which the spectrum of the radiation energy field 
is calculated by the multifrequency-gray method [61], and of which this algorithm is a subset, 
and a gray Pi package which contains a number of novel features. The gray diffusion algorithm 
described here is of note primarily because of the exponential differencing of the material energy 
equation and of the novelties in the iteration scheme. Exponential differencing is particularly 
useful in a class of problems in which radiation floods a region of space and serves to heat a 
contained body. This situation occurs frequently in inertial confinement systems, where the 
radiation source is produced by e-beam or laser deposition in a hohlraum wall and the goal is to 
heat and compress a capsule containing the nuclear fuel. 

8 Verification and Validation for SAGE and RAGE 

During the course of the ASCI (now ASC) Project, substantial time has been devoted at LANL, 
either directly or indirectly, to the verification and validation (V&V) of the Project codes. Direct 
V&V results from a Project team member running specific simulations to quantitatively compare 
to analytic results (verification) or to results from experimental data (validation). Indirect 
validation results from the use of the Project codes by end-users for their own purposes, either 
in designing a new AGEX (Above Ground Experiment) experiment, in the analysis of a previous 
experiment, or merely as a tool for analysis of a physical problem. 

8.1 Verification 

There exists a large number of verification test problems in our community, including the Se- 
dov exploding blast wave [80, pp. 393-396], the Guderley converging shock,[81], [82, p. 793], 
anthologies of hydrodynamic [83] and radiation- hydrodynamic problems [84] , both LTE and non- 
LTE [72, 85, 66], Marshak radiation wave problems, and many more (non-LTE in this context 
means that Tr ^ T^). A detailed report on the suite of verification problems is being prepared 
(see e.g. [86, 87]), as well as work on automating such ongoing testing [88]. 

One example of such a verification problem is the Scdov problem, with unit density and zero 
initial velocity but 4.936 • 10^^ erg in a delta function at the origin (i.e. in one zone at the origin, 
so that r shock = 1 cm at t—lQ ns). The results from the (2d) cylindrical Sedov problem are 
shown in Figure 28, comparing the SAGE adaptive mesh hydrodynamics to the analytic result, 
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Figure 26: Radiation temperature profiles for a non-equilibrium Marshak wave using 
one, two, and fifty timesteps (i.e., 15, 18, and 122 matrix solves). 




Figure 27: Matter temperature profiles for a non-equilibrium Marshak wave using one, 
two, and fifty timesteps (z.e., 15, 18, and 122 matrix solves). 
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Figure 28: 2D (rz) Spherical Sedov problem. Density is plotted vertically and colored by 
density; the analytic solution {pmax = 4) is reflected in the left quadrant for comparison to 
the calculation in the right quadrant (numerically diffused pmax ^ 3.5). 

and some error norms are provided in table 1, which incidentally shows that the same accuracy 
can be achieved in less time and fewer zones using adaption. The ratio of Li(p,u,p) errors for 
the fixed meshes is consistent with those errors being proportional to typical for 

most shock problems we have looked at (shockless problems without discontinuities do behave 
like Ax^). 



level 


Li{p) 


Li{p) 


Li{u) 


^ cycles 


zones 


cpu-time[s] 


cc/s/pe 


1 


0.13 


3.99el3 


1.80e6 


289 


900 


13.2 


1.97e4 


2 


0.0869 


2.59el3 


1.23e6 


1016 


3600 


139.3 


2.63e4 


3 


0.0555 


1.67el3 


7.87e5 


3406 


14400 


2020.3 


2.43e4 


4 


0.0328 


1.00el3 


4.74e5 


10867 


57600 


24512. 


2.55c4 


Al-4 


0.0327 


1.00el3 


4.72e5 


8052 


30921 


8268.7 


1.90e4 



Table 1: Comparisons of Li error norms for density, pressure and velocity on 4 fixed size meshes 
as well as a mesh that adapts ("Al-4") to the finest level. Also shown are measures of the time 
required to calculate the answer (on a Mac G5). 

In general we are very pleased with the success of the SAGE and RAGE algorithms in 
comparing to analytic and semi-analytic results (see e.g., [89] for an example of comparisons 
to semi-analytic problems and Section 4.5 for four comparisons to analytic solutions). Many of 
these verification problems are included in a regression test suite (run on the LANL machines) 
as part of the software quality assurance process. 

8.2 Validation 

Our first validation effort, [90], "The Simulation of Shock Generated Instabilities" compared 
detailed code simulations of shock tube experiments to the experimental results in a quantitative 
manner; the results from this ("gas curtain") work are summarized (in the "vu-graph" norm) 
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Figure 29: Comparison of experimental and calculated density profiles for a gas curtain 
shock-tube experiment, t — profiles provided initial conditions. 

in figure 29. In order to agree with the data at t ^ 450 ms, the code was initialized with 
the density profile measured at t = ms (after background subtraction and some smoothing). 
Further Richtmyer-Meshkov validation was done using shocks generated by lasers [91] and pulsed- 
power [92]. RAGE has also been used to model more general laser-induced shock effects [93, 94, 
95] and more recently, 3-D oceanic disturbances generated by asteroid impacts and tsunamis [96, 
97, 98, 99]. 

Code validation also formed the basis for a Ph.D. thesis for a SUNY Stony Brook student [100, 
101]. In this work, code simulations were done for a similar shock tube environment but in 
this case the Mach 1.2 shock interacted with a dense cylinder of gas, instead of a gas curtain. 
Both the code and the experimental techniques had changed by the time of this comparison 
(the code was rewritten from Fortran?? with a Lagrange plus Remap hydro to Fortran90 with 
the Direct Eulerian hydro described in this paper; the original experiments [90] used Rayleigh 
scattering as a diagnostic while the new ones used "fog" tracer particles, which were thought 
to be equivalent to Rayleigh images and were essential in order to perform particle velocity 
measurements) and the agreement between the two was not as good as it was in the earlier 
work. Having determined that changing the hydro algorithm didn't change the discrepancy 
with experiment, changes to the initial conditions were tried - it was hypothesized that the SFg 
diffused faster than the fog particles before the first image was taken, leading to incorrect initial 
density profiles - and much better agreement was obtained. The top part of figure 31 shows 
how much the initial RAGE density profile had to be diffused in order to match the (late-time) 
data (shown in figure 32), compared to the original profile based on the "fog" tracer particles 
(left half, figure 30). This prompted the experimenters to re-image their initial profiles using 
a Rayleigh scattering technique (right half, figure 30). The lower part of figure 31 shows how 
rage's diffused density profile (which gave good late-time agreement) compared to a (later) 
Planar Laser Rayleigh Scattering (PLRS) measurement.^ In this case, one could say that the 

*The agreement between the profile used to model one experiment and the PLRS profile from another shot 
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code validated the experimental technique. On the other hand, the advantage of the fog tracer 
particles is that they enable the experimenters to measure the velocity field in the gas cylinder 
(from two closely spaced images). When the calculated and simulated images of the density 
agreed (with the difi^used density profiles), the velocity data were also found to agree, to with 
10-15 percent, and one can say that the experiment validated the code in this case. 

Another example of validation is the work being done as a part of a multi-lab, multinational, 
collaboration on the simulation of supersonic jets and shock interactions [102], where detailed 
quantitative inter-comparisons have been done among several computational tools (including 
RAGE) and the actual experimental data obtained from a series of AGEX ("Above Ground 
Experiments") experiments. All the code techniques, including CAMR RAGE and ALE codes 
from both Livermore and AWE (the UK Atomic Weapons Establishment), show good agreement 
with the data. In this and other cases [103], results of code simulations are used to design as 
well as predict future experiments. 

These examples touch on the breadth of validation efforts completed and underway in the on- 
going effort to understand the range of capability for the CAMR Eulerian-based hydrodynamics 
and radiation energy flow as coded in the massively-parallel, modular version of the RAGE 
code. More detailed documentation efforts are underway. Two of the areas of active research 
in the Crestone Project currently are the treatment of interfaces between materials and the 
improvement of the material strength package. 

9 Summary 

We have presented a description of the SAIC/LANL Radiation Adaptive Grid Eulerian code, 
RAGE, describing some of its data structures, its parallelization strategy and performance, its 
hydrodynamic algorithm(s), and its (gray) radiation diffusion algorithm. We have also shown 
that all packages and the code as a whole are subject to a considerable amount of verification 
and validation efforts. 

The Godunov-based hydrodynamics uses a second-order alternating-direction explicit (ADX) 
method, and this algorithm determines the maximum timestep the code can take. The iterative 
nature of the anti-cavitation logic handles stiffnesses in the equation of state in an implicit 
manner, meaning that we do not have to control the timestep based on, for example, relative 
changes in density or mass fraction. 

The gray radiation package uses a temporally first-order and spatially second-order backward 
Eulcr method to solve the diffusion equation. The material energy equation is integrated ex- 
actly for opacities proportional to T~^, and to any specified accuracy for all other behaviors by 
subcycling the numerical integration. This solution is then linearized to build the coupled total 
energy diffusion equation, and a subsequent Newton-Raphson step rebalances the material and 
radiation energy to correct for any linearization errors. Finally, an outer iteration is performed 
until the total radiation plus material energy in each zone is relatively unchanged between two 
iterations. 

Because the radiation package has taken great pains to treat (1) stiff material coupling terms 
(/iabs or Cy) by subcycling, and (2) linearization errors by Newton-Raphson fixups, it can be 
considered to be maximally implicit. This means that its accuracy is completely determined by 
the relative changes in radiation energy, independent of the changes in material temperature. 

RAGE by default attempts to take relatively large timesteps, based on controlling just the 
relative change in total energy over the entire timestep, 5{pEm + E)/{pEm + E), to some 
tolerance, e.g., sie_pct = 0.2. It is also possible to control the material temperature change on 

shows how reproducible the experimental conditions were. 
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Figure 30: Experimental images of a gas cylinder, using "fog" tracer particles on the left, and 
Rayleigh-scattering on the right. 
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Figure 31: Top: original (measured) and "fitted" (dif- 
fused) initial density profiles for a gas cylinder experi- 
ment; the original was an (angularly) smoothed mea- 
sured fog distribution. Bottom: comparison of that 
"fitted" profile to the measured Rayleigh-scattering 
profile {n.b., scales have changed). 
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Figure 32: Experimental images of a gas 
cylinder, using "fog" tracer particles on the 
left; calculated images using sharp (middle) 
and diffuse (right) profiles from Fig. 31 top. 



each step of the operator splitting to a tolerance (tev_pct), but this is a legacy of Lagrangian 
technology. We intend to update the splitting logic to instead control relative changes of material 
and radiation energy on each step of the operator split, and may couple this with an overall 
control on the non-linear residual. 

The implementation of these implicit treatments allow our AMR code to run on timesteps 
that are considerably larger than would be allowed by the traditional small-percentage-changes 
in material and radiation temperature on each step of an operator split, and this means that 
our AMR code can run more highly-resolved problems than might otherwise have been possible, 
within time and resource constraints. 
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A A Less Ideal Opacity 



The case of constant r is but one example of a whole class of situations in which exponential 
differencing (55) can be treated analytically: that for which r has a power-law dependence on 
0, and thus Given that 



r = X 



(86) 



Eq. (55) becomes 



At 



At 

A 



0+ 



d(t> 



1 ' 



^ 0^+Sfi(A+l,l;A + 2; 



A + 1 



(87) 



where any represents the corresponding <I> scaled to E-^^: 

2Fi(A + l,l;A + 2;(/i) is of course the confluent hypergeometric function of the second kind [104]. 

This is a very nice form: since <&_, A, and are all known, the left side of Eq. (87) is 
just a constant multiple of At (also known). The right side, on the other hand, is a function of 
both E and This then provides all we need to find <&+ as a function of E; if need be we can 
resort to a numerical root-finding routine to extract the information. 

This chore can be rather intimidating in general, but is much easier when A is a multiple 
of 1/4, because then the hypergeometric function can be expressed in terms of standard simple 
functions. 



For example, consider the case A 
opacity. Then since 



as occurs with a constant specific heat and 1/6 



2F^{l+l,l-l + 2-4,)\i_ 



1/2 



tanh ^ 
71 



Eq. (87) becomes: 



Using this we can explicitly display in terms of a simple parameter, X, 

X - r ^ 




where X is 



X — exp 



X + l 



At 
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Note that we used Eq. (86), the scaling law for t, to evaluate te- 

Earlier we discussed three different simple models for the parameter a, which governs the 
energy transferred between matter and radiation during a timestep, Eq. (60). It is of interest 
to examine a for this new case: 



1-0+ 



AX 



(1+^) 



Figure 33 shows the results for three different values of a parameter heretofore irrelevant. 
The bottom curve, for = 0, is the same as the curve labeled "exponential" in Fig. 18. Again 
this more sophisticated treatment yields seemingly more reasonable results. 



X=-l/2 




Figure 33: A fourth model for a 



B Alternative Face Temperature Calculations 

Instead of assuming that 0"* varies linearly with optical depth across an interface, 

another choice would be to make the minimalist assumption that the zone-centered temper- 
atures vary linearly between zones and interpolate a material temperature on the face, 

unless the problem is in radiative equilibrium, in which case we might expect E and 9^ to vary 
linearly in space, leading to 
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If one knows that if the problem is in a completely ionized regime, so that the opacity is 
dominated by free-free (Bremsstrahlung) physics, the mono-frequency opacity goes like Te ^^"^ jv^ . 
When this opacity is averaged over frequency using as a weight the Rosseland at the radiation 
temperature (since we are talking about gray diffusion, the only questions are whether the flux is 
a Planckian or Rosseland, and whether it is at or T^; we make the latter choices), we end up 
with a "two-temperature" opacity with mean free path A '-^ tI^^T^ (a two-temperature opacity 
is the result of using a non-equilibrium Planckian or Rosseland radiation field to calculate a 
gray average: e.g., K{0e,Tr) ~ /p K,^(9e)Bi,(Tr)dv). Ignoring the weak dependence on material 
(electron) temperature, we can write this fiux as ^ T^VE ^ E^^'^VE. If we ignore quarter 
powers of things, we can say that the flux behaves approximately as cx Vi?^, whose flnite 
difference approximation would be 

F (X V£;2 

\ 2 J Axl+Axr ■ 

(For electron conduction, the Spitzer conductivity varies as 6'f/^ leading to - Ol^'^VOe ~ 
9lV9e. Since VOl/3 — ^{Oj^ + OlOr + 6j^)V9e, we have found that the intermediate tempera- 
ture. Of ace = \/ {0\ + OlOr + Oj^) /3, in Spitzer's formula gives good results on this variant of a 
Marshak wave.) From this point of view, one would argue that the face temperature ought to 
be related to the average of the two energy densities without any linear interpolation (in space 
or optical depth) : 

- (^) . (91) 

It has been our experience that whether one uses the assumption of linearity in optical depth 
space for 9'^ (Eq. 88), or either of the linear in space methods (Eqs. 89, 90), all Marshak-wave 
types of problems can be well modeled. Although we have not looked at the last averaging 
method, Eq. (91), in any detail, it ought to give results similar to the linear-in-space average 
of 9"^ in most cases, and might be the best approach to use with two-temperature opacities in 
general. 



C Improved Difference Scheme for T- Cells 

When adjacent zones are equal-sized, as for the case of vertical flow between zones "1" and "3" 
in Fig. 34, the assumption of piecewise continuity of E along with the requirement of continuity 
of flux from zone 1 to the (13) face and from that face to zone 3 gave the two equations that 
defined Eface as the optical-depth weighted average of the zonal E's and Tface as the sum of 
the zonal optical depths. (In principle, i<^"|°"'^ ^ Eif""^, but piecewise continuity means that 
/ dWE around a tiny volume straddling the (13) interface as dV 0, and that forces 

j^above rpbelow \ 

■^13 ~ ^13 ~ J^face-) 

For T-cells (reverting to the horizontal direction of fiow in Fig. 34), piecewise continuity of E 
means that as / dWE = ^ dAfEf 0, the three non-coincident values of E at the respective 
faces of the zones are related by: Aq ■ Ea^^^^ — A3 • Ecf,,^^ + Ai ■ Eb/^^^ , where Aq is the area 
of the large zone's face at the T-cell interface, and Ai and A3 are the areas of the interface as 
seen by zones 1 and 3 respectively. 

In 2-D Cartesian geometry, the Ai — Ao/2; in 3-D, they are a Ao/4. In 2-D cylindrical 
geometry, for face-normals in the f direction, the Ai are also Aq/2, but for normals in the z 



61 



Figure 34: Edwards' difference stencil. 



direction, they vary because dA = rdr, and r varies (the smaller radius face varies from ^Aq at 
the origin to ^Aq approaching infinity). Let us define = Ai/Ao. 

By requiring continuity of the fluxes at T-cells and using the relation Ea = cliEb + a^Ec, 
we have the following two constraints: 



Fl = - 
Fl = - 



cEa-Eq 
3 Ato 
cEa-Eq 
3 Ato 



3 At3 
c Ei~Eb 
3 Ari 



Once one has solved for the uninteresting Eb and Ec, one can calculate the common face flux: 
c aiEi + asE^-EQ _ c ai{Ei - Eq) + as^E^ - Eq) 



Eface 



3 aiAri + agArg + Atq 3 ai(ATi + Atq) + aaiArs + Atq) 



Note that this flux could be regarded as the weighted average of the naive fluxes, F^"'™'^ = 
— I Xt'+Atq ' "^ith the weights ai(Ari + Atq). On the other hand, we can define an optical depth 
for the faces, 

Tface = Ato + oiAti + asAra = ai(ATo + Ati) + a3(ATo + Ats) , 
in which case the fluxes can be written as the area-weighted average of less-nawe fluxes: 
c aiEi + a-^E^ - Eg 



face 



= ai 



C (El - Eg) 
3 Tface 



0.3 



C (^3 -~ Eq) 
3 Tface 



3 Tface 

In 3D, we assume zone abuts 4 smaller zones, f , 2, 3 and 4. In that case, 

Tface = Atq + OiAti + a2AT2 + a^AT^ + GiAT^ , 

= ai(Aro + Ati) + aaCAro + Atj) + OsIAto + Arg) + a^ATo + At^) , 
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and the mean flux is similarly given by 



face — 



c aiEi + 02-^2 + a^Ez + a^Ei - Eq 
3 



+ai 

+03 



c {El - Eo) 



3 Tface 

c {E3 - Eo) 



3 Tfa 



+ 02 



04 



C {E2 - Eq) 
3 Tface 
C {Ej - Eq) 
3 Tface 



One can define a simple test problem, with a linear temperature distribution {e.g., 1000 eV 
at a; = 10 to 2000 eV at x = 20 independent of y). In Cartesian geometry, the Lqo error in 
temperature (assuming both Cy and k are constant in the material heat conduction equation) is 
approximately half of one percent and behaves like 0{Ax^) on refinement. Including these face 
effects in r and V • F, the error can be reduced to machine accuracy (10~^^), even without 
modifying the matrix (nor does including the modifications in the matrix change the perfect 
result). 

In cyhndrical geometry, the uncorrected Loo error is again about a half percent and consistent 
with 0{Ax^). It is reduced to about a part per thousand with the full (r, V • F, and matrix) 
correction, now consistent with 0{Ax'^), since the steady state solution has T cx ln(r). In this 
case, correcting the face t and V • F without correcting the matrix generates Lqo errors that are 
comparable to the temperature {i.e., greater than 100 percent). 

For this reason, it was and is better to do nothing than try to do a partial fix, and until 
we are certain that our current implementation runs as well on multiple processors as on one 
processor, we will continue to do nothing. 



D Spillman's variant of the Marshak/ Milne Boundary Con- 
dition 

When constructing divergences of fluxes, the elemental building block is the contribution of each 
face to the divergence, and this quantity, 

is used to calculate divergences of the flux for the matrix source term, edits, etc. At boundaries, 
one of the t's obviously fails, leaving only the other, Atz- In that case, the Dirichlet condition 
sets the contribution to 

{F-dS)face = -'-^-{E^-aT^J, 

Ov{Y.dS)face = -"-^-{aT^ce-EL), 

depending on the boundary. Henceforth, we will concentrate on the Left boundary and zone R. 
The Marshak/Milne condition sets the contribution to 

(F • dS)face = • - «^/ace) • 
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The Spillman variant [76] is an attempt to cover more bases. Assume that the user specifies a 
(dimensional) vacuum boundary length, Xy (default — 1 cm), and recall that Xz was used to 
calculate the optical depth: Atz — Aszf/Xz- Defining the ratio, 

^ 2Xv + 3Xz 
^ Xv + 2Xz ' 

Spillman sets the contribution to 

SO that when Xz is large compared to both Ay and Ax, C | the denominator goes to 
I + Atz- When Xz is small compared to Ay but still large compared to Ax, ^ — > 2 and the 
denominator goes to | + Atz, the standard Milne result. Even if the user makes a poor choice 
of Ay, in optically thin zones, Spillman will always give Milne-ish behavior. 

However, if the user intended to have infiow via a Milne condition, but made the zone abutting 
the boundary so large that its optical depth was enormous, Milne's 2/3 + At is indistinguishable 
from Dirichlet's t, and the fiux 0. In this case the numerics allows essentially no fiow across 
the boundary, contrary to what would have been allowed if the zone had been thinner, and what 
physics presumably always requires. 

In such a case of large optical depth (Aszf 3> Xz), Spillman's exponential term vanishes 
and the denominator varies between unity and | (depending on the exact value of Ay relative 
to Xz)- The numerical fore-factor (~ | • • • 1) will now be within a factor of two of the optically 
thin Milne result (|), instead of vanishing. These contributions from boundary faces do not 
contribute to any off-diagonal elements of the matrix to be solved, but the factor multiplying 
{E — aT"*) does add (actually, subtract) from the diagonal element of the matrix for the zone 
adjacent to the face. This coefficient also contributes to calculating the source term of the matrix 
equation (the "vacuum" "zone" shows up as an explicit source for the adjacent real zones ). In 
the case of thermal conduction, one calculates the divergence twice: once to form the source 
for the matrix solver, and a second time afterwards to calculate a conservative energy transfer 
between all faces based on kV9 fluxes; for radiation, £^ is a conserved quantity already, so only 
the first, matrix-source divergence is necessary. 

Thus we can see that the optically thick limit of Spillman's variant allows the vacuum to 
radiate like a black body (more correctly, it radiates like the difference of two black bodies). F ■ 
dS « ±jAfacec{Ez — aT^), and compares well to the optically thin Milne result, ±^Afacec{Ez — 
aT^), and disagrees with the optically thick Marshak/Milne (or Dirichlet) result, ztAfacec{Ez — 
aT^)lTz ^ 0. 
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